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6  Summary 


CHAPTER  1 


The  Bispectrum  in  Multiple  Perspectives 


1-1  Intuitive  Understanding  of  the  Bispectrum  is  Possible. 

The  bispectrum  is  a  potentially  valuable  tool  for  analyzing  any  nonlinear 
process.  Moreover,  it  has  been  applied  to  a  wide  variety  of  nonlinear  problems  in 
numerous  disciplines.  [1.1-1]  However,  only  recently  has  it  gained  a  substantial 
measure  of  popularity. 

There  are  at  least  four  possible  reasons  as  to  why  the  bispectrum  is  only 
now  coming  into  its  own.  First,  the  bispectrum  is  computationally  relatively 
expensive.  Second,  the  additional  knowledge  which  the  bispectrum  supplies  may 
be  too  limited  to  justify  the  extra  effort  required.  Third,  physical  and  intuitive 
understanding  of  the  bispectrum  are  not  widespread.  Fourth,  the  mathematics 
of  the  bispectrum  are  sometimes  considered  excessively  complicated. 

The  computational  expense  of  the  bispectrum  is  no  longer  a  significant 
problem  as  powerful  machines  have  become  commonplace.  The  impact  of  the 
second  reason  is  rather  difficult  to  assess:  it  is  true  that  many  investigations 
do  not  require  the  additional  insight  afforded  by  the  bispectrum  and  further,  in 
many  cases,  this  additional  information  is  small.  However,  it  is  not  easy  to  know 
in  advance  that  the  information  gained  will  or  will  not  be  of  value.  Moreover, 
in  certain  circumstances  any  information  may  be  of  such  importance  that  the 
possibility  of  its  acquisition  should  not  be  ruled  out  at  the  outset.  Finally,  the 
literature  of  the  applied  bispectrum  demonstrates  its  usefulness  —  at  least  to  the 
specific  problems  studied. 

This  report  focuses  on  the  third  and  fourth  impediments  to  the  more 
widespread  use  of  the  bispectrum:  lack  of  physical,  intuitive  and  mathematical 
understanding  of  the  bispectrum. 

The  remainder  of  this  chapter  will  attempt  to  foster  the  reader’s  intuitive 
and  physical  understanding  of  the  bispectrum  by  presenting  the  bispectrum  in 
different  perspectives.  The  hope  is  that  of  the  following  assortment  of  ways  of 


wiw^iaBmumuBUi  many  wbbow  gwwwA" jqngpgav  w  u*  wgKvuvx  m  v  ^rwonqpqi  rj  tct  r jtrj<  ,v  -j>  -*. 


regarding  the  bispectrum,  at  least  one  will  “make  sense”  to  the  reader.  More 
optimistically,  these  viewpoints  will  reinforce  one  another  to  provide  a  satisfactory 
intuitive  picture.  These  sections  do  not  contain  detailed  mathematics  and  the 
reader  is  encouraged  to  avoid  any  perspective  which  seems  unnatural  or  unduly 
complicated.  For  this  chapter,  no  results  are  proved  and  the  mathematics  are 
partially  ornamental  (and  should  be  studied  only  if  helpful).  (Proofs  can  be  found 
in  the  references,  if  desired.)  However,  more  mathematical  detail  is  available  in 
the  following  chapters. 

The  basic  mathematics  of  the  bispectrum  is  described  in  general  in  Chap¬ 
ter  2  and  a  specific  model  designed  to  reinforce  this  mathematical  understanding 
is  presented  in  Chapter  3.  Practical  considerations  such  as  how  to  estimate  the 
bispectrum  of  a  time  series  are  dealt  with  in  Chapter  4.  (Part  of  the  purpose  of 
this  report  is  to  make  the  bispectrum  a  more  useful  tool,  so  such  considerations 
need  to  be  included.)  The  fifth  chapter  presents  a  detailed  worked-out  example 
and  uses  the  model  presented  in  Chapter  3  and  the  statistical  methods  of  Chapter 
4.  The  final  chapter  presents  a  summary  of  the  report  and  an  appendix  presents 
a  catalog  of  time  series  with  corresponding  bispectra. 

The  bispectrum  comes  in  many  varieties  —  deterministic  or  stochastic, 
discrete  or  continuous  time,  and  with  or  without  the  assumption  of  stationarity. 
The  bulk  of  the  literature  considers  the  stationary  stochastic  case  and  this  report 
shall  dwell  primarily  on  this  case  as  well.  Moreover,  time  shall  in  general  be  taken 
to  be  continuous. 

1-2  Bispectrum  =  Double  Fourier  Transform  of  Third  Order  Cumu- 
lant  (Definition). 

The  power  spectrum  P(cu)  of  a  continuous  time,  stationary  stochastic 
time  series  x(t)  is  usually  defined  as  the  Fourier  transform  of  the  second-order 
covariance  function  c2(t).  Explicitly  one  has 


and 


C2{t)  =  (x{t)x(t  +  t)) 
P(w)  =  /  c2(t)  e~,UT  dr, 

J  -OO 


(1-2.1) 


(1-2.2) 


where  the  angle  brackets  denote  ensemble  average  and  this  average  is  independent 
of  the  time  t  because  of  the  assumed  stationarity. 

The  bispectrum  B(wi,u;2)  is  defined  by  analogy  to  the  power  spectrum 
[1.2-1].  It  is  the  (double)  Fourier  transform  of  the  third-order  covariance  function 
C3(n  ,r2). 
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Explicitly, 

C3(ti,t2)  =  (x(<)x(*  + T,)x(t  +  r2)),  and  (1-2-3) 

B(Wl,u*)  =  r  r  c3(r1,r2)e-«w'^-^fa>dr1rfT2.  (1-2.4) 

J  —  oo  J  —  oo 

What  kind  of  insight  does  the  definition  provide  into  the  interpretation  of 
the  bispectrum?  Unfortunately,  precious  little  seems  to  be  the  answer.  Features  in 
the  bispectrum  correspond  to  effects  of  (triple)  correlation  in  the  original  signal. 
It  is  clear  that  the  bispectrum  is  zero  if  there  is  no  triple  correlation.  Using  one’s 
intuition  about  the  relationship  between  time  and  frequency  domain  quantities, 
it  is  clear  that  a  “spiky”  bispectrum  implies  a  rather  “flat”  triple  correlation  and 
therefore  a  time  series  possessing  intervals  of  slow  change.  Inversely,  a  repetitious 
series  of  spikes  in  the  time  domain  may  be  expected  to  correspond  to  flatter  peaks 
in  the  bispectrum.  Beyond  this  much,  the  definition  seems  not  to  provide  aid  to 
the  intuition. 

1-3  Bispectrum  =  Expectation  of  Product  of  Interacting  Frequency 
Components. 

If  one  transforms  the  original  signal  x(t)  into  the  frequency  domain  by 
defining  x(u>)  such  that 

x^  =  'hJ,  (i-3.1) 

then  it  is  well-known  (and  easy  to  prove)  that  the  power  spectrum  can  be  written 
directly  in  terms  of  the  frequency  domain  quantities 

P(w)  =  (x(u»)*(-w)>.  (1-3.2) 

Again  the  angle  brackets  denote  ensemble  average.  The  fact  that  the  two  fre¬ 
quencies  u>  and  —u>  sum  to  zero  is  a  direct  consequence  of  stationarity.  It  is  easy 
to  extend  the  proof  of  the  above  formula  to  apply  to  the  bispectrum. 

One  finds  that 

B(u;i,fa;2)  =  (x(u;1)x(u;2)x(— uq  -  a>2)).  (1-3.3) 

The  fact  that  the  three  frequencies  uq,  u>2,  and  — uq  —  u>2  sum  to  zero  is  a  direct 
consequence  of  stationarity. 

This  form  of  the  bispectrum  is  somewhat  helpful  to  the  intuition.  The 
bispectrum  can  only  be  nonzero  at  a  particular  pair  of  frequencies  (u>j,  u;2)  if  the 
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Figure  1-1  Localization  via  cross-spectrum. 


frequency  component  at  the  sum  of  these  frequencies  (x(uq  +  u>2))  is  statistically 
dependent  on  the  product  (x(uq)  £(<^2))  these  frequency  components.  1  hat  is, 
the  two  original  frequencies  must  couple  through  their  sum  term.  Moreover  this  is 
the  only  ( three-wave )  coupling  possible  for  stationary  time  series.  (The  previous 
statement  is  true  provided  the  time  series  is  continuous-time,  as  has  been  assumed 
so  far.  For  a  discrete  time  series  it  is  only  necessary  for  the  three  frequencies  to 
sum  to  zero  modulo  2tc.) 

Thus  the  bispectrum  measures  coupling  between  frequencies  or,  equiv¬ 
alently,  it  measures  a  specific  phase  relation  between  frequency  components.  If 
there  was  no  such  relationship  between  the  phases  then  the  expectation  above 
would  vanish. 

1-4  Bispectrum  =  Cross-Spectrum  of  Frequency  Components  (Mac¬ 
Donald’s  Filter  Interpretation). 

In  an  interesting  MITRE  company  report  [1.4-2]  devoted  to  the  bi¬ 
spectrum,  Munk  discusses  the  possibility  of  using  the  bispectrum  to  localize  a 
target.  This  possibility  arises  because  of  the  following  interesting  analogy. 

Currently  cross-spectral  analysis  is  frequently  used  for  localization,  as 
shown  in  Figure  1-1.  One  has  two  detectors  and  consequently  two  signals.  These 
signals  can  be  crosscorrelated  and  analyzed  to  determine  phase  differences.  (  The 
cross-spectrum  can  be  computed  by  either  of  two  methods.  One  can  compute 
the  crosscorrelation  first  and  then  Fourier  transform  or  one  can  compute  the 
transforms  first  and  then  multiply.)  In  bispectral  analysis,  there  is  only  one 


signal  at  the  outset,  but  it  is  separated  into  two  frequency  components.  Munk 
suggests  that  one  can  then  heterodyne  both  components  to  zero  frequency  and 
finally  crosscorrelate.  (See  Figure  1-2.) 


PASS-BAND  HETERODYNE 


Figure  1-2  LOCALIZATION  VIA  BISPECTRUM  (MUNK’S  VERSION). 

Here  one  can  see  that  bispectral  analysis  looks  at  just  that  information 
which  ordinary  spectral  analysis  throws  away:  inter-frequency  phase  information. 
(The  “relative  phase”  or  “inter- frequency  phase”  of  the  two  frequencies  and  /2 
is  generally  defined  as  the  phase  part  of  the  bispectrum.) 

MacDonald  [1.4-3]  provides  a  formal  treatment  of  the  above  argument. 
A  condensed  version  of  his  argument  is  presented  below  and  may  be  skipped 
if  desired.  MacDonald’s  treatment  actually  leads  to  the  representation  shown 
in  Figure  1-3  which  is  slightly  different  from  Munk’s  version.  In  fact,  there  is 
no  explicit  treatment  of  the  bispectrum  which  corresponds  exactly  to  Munk’s 
description  and  it  would  be  interesting  to  try  to  construct  one.  (This  might  make 
a  valuable  mini-research  project  for  the  reader.) 

Start  with  a  continuous  time,  stationary,  ergodic,  zero  mean,  stochastic 
time  series  x(t).  Imagine  that  one  can  construct  ideal  passband  filters  with  center 
frequency  /j  and  bandwidth  A.  If  x(t)  is  input  to  the  passband  filter,  denote  the 
filter  output  by  x±  (<).  Now  take  the  input  time  series  x(t)  and  “fan  out”  into  three 
copies.  Send  one  copy  into  a  passband  filter  centered  at  /(.  Send  another  copy 
into  a  passband  filter  centered  at  /2.  Send  the  final  copy  into  a  passband  filter 
centered  at  f$  =  fi  +  /2.  Let  all  three  filters  have  bandwidth  A.  Now  multiply  the 
outputs  of  all  three  filters  together.  In  a  sense,  the  fact  that  the  third  frequency  is 
the  sum  of  the  other  two  allows  one  to  think  of  this  multiplication  as  heterodyning 
both  signals  to  zero  and  crosscorrelating  in  the  same  step.  (This  is  the  part  that 
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Figure  1-3  Localization  via  bispectrum  (MacDonald’s  version). 

needs  work  to  give  a  completely  explicit  meaning  to  Munk’s  interpretation.)  Take 
the  output  of  the  multiplier  and  time‘average.  Finally,  multiply  the  output  of  the 
“time‘averager”by  2/(3A2).  The  result  is  the  real  part  of  the  bispectrum  of  x(t) 
evaluated  at  /A,  y2-  If  the  imaginary  part  is  desired,  it  is  only  necessary  to  change 
the  f3  passband  filter  to  include  a  uniform  7r/2  phase  shift. 

This  condensed  version  of  MacDonald’s  treatment  may  be  viewed  as  a 
signal  processing  or  analog  explanation  of  the  bispectrum  (in  contrast  to  the 
digital  explanation  of  1.3). 

1-5  Bispectrum  =  Nonlinear  Coupling  of  Product  of  Interacting  Fre¬ 
quency  Components. 

This  action  shall  relate  the  bispectrum  to  two  models  involving  nonlin¬ 
ear  coupling  of  waves.  The  first  model  consists  of  a  simple  quadratic  nonlinearity. 
The  second  involves  a  physically  realistic  nonlinear  evolution  equation.  Both 
models  are  presented  by  E.  J.  Powers  and  his  co-authors  [1.5-1,  1.5-2]. 

Establishing  a  suitable  framework  for  developing  both  applications  is  first 
on  the  agenda.  Some  noisy  quantity  of  interest  (call  it  <p  and  for  simplicity  take  it 
to  be  scalar)  propagates  in  some  uniform  medium.  For  simplicity  of  exposition  <f> 
will  be  assumed  to  vary  significantly  in  only  one  dimension,  e.g.,  the  x  direction. 
Following  the  usual  procedure,  Fourier  decomposition  of  $  is  called  for.  Write 

<f>(x,  t)  =  <f>w(x)  exp[ifc(u;)x  —  iwt\.  (1-5.1) 


The  quantity  k(u)  is  the  frequency  dependent  wave  number  and  characterizes  the 
medium  under  consideration.  Since  (j>  is  a  random  quantity,  so  is  the  <f>u{x). 


Interest  centers  in  both  models  on  the  behavior  in  time  of  the  values  of 
<f>  at  one  particular  point  xp  in  space,  the  probe  position.  To  expedite  matters, 
define  y(t)  to  equal  (f>(xp,t).  If  one  sets 

^M  =  <£wexp[ifc(u>)xp],  (1-5-2) 

then  Y  and  y  form  a  Fourier  transform  pair. 

The  power  spectrum  of  y,  as  usual,  is 

p(w)  =  (y'(w)y»)  (1-5.3) 

and  the  bispectrum  is 

B(W1,W2)  =  (y(w,)y(«^)y(u;i  +  m)-  (1-5.4) 

Both  models  will  be  simplified  in  that  only  three  waves  will  be  considered. 
These  waves  will  be  assigned  frequencies  u>,  uq,  and  lj 2,  where  cj  =  ajt  +  u2. 

With  the  framework  for  both  models  established,  one  can  now  consider 
the  first  model  [1.5-1].  In  this  model,  the  nonlinearity  is  taken  to  be  of  the  simple 
form 

Y(u)  =  AYMY(u2)  +  Y'.  (1-5.5) 

Here  V  is  some  quantity  which  contributes  to  T(u;)  but  is  statistically  indepen¬ 
dent  of  the  coupling  term.  A  represents  the  coupling  constant. 

Computing  the  power  spectrum  at  u>,  one  easily1  finds 

PM  =  |A|2(|rM)  km)|2)  +  (in2).  d-5-6) 

The  first  term  represents  the  contribution  of  the  wave-wave  coupling  interaction 
and  the  second  includes  everything  else. 

It  is  just  as  easy2  f  to  compute  the  bispectrum  at  uq,  u>2: 

BM,M  =  ^(in^i)yM)i2).  (i-5.7) 

One  sees  immediately  that  the  bispectrum  conveys  information  only  about  the 
coupling  interaction.  In  fact,  one  could  partially  normalize  the  bispectrum  so  as 
to  get  the  coupling  coefficient  from  experimental  data  as  follows: 

A  -  /,e  o') 

(irM)yM)i2)‘  ( " ' } 

’Proving  this  takes  only  a  few  lines  and  may  be  a  useful  exercise. 

2This  formula  is  also  very  easy  to  derive  and  might  serve  well  as  the  reader’s  first  “bispectral 
computation” . 
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In  the  second  model  [1.5-2],  a  realistic  evolution  equation  is  assumed, 

d<f>, 


dx 


=  (i6kx), 


(1-5.9) 


with  frequency  matching  (u>  =  uq-|-u;2),  possible  mismatch  in  wave  number  (char¬ 
acterized  by  <5*  =  k^  +kuf2—ku),  and  wave- wave  coupling  (with  coupling  coefficient 

V). 

This  type  of  equation  is  commonly  found  in  nonlinear  optics  and  in 
plasma  physics.  It  shows  how  a  wave  gains  or  loses  energy  due  to  interactions 
with  waves  of  different  frequencies  as  it  propagates  in  space. 

As  implied  earlier,  one  should  properly 

•  make  the  right  side  an  integral  over  all  uq  (taking  u;2  =  w  — 

•  let  V  be  dependent  on  uq  uq,  and  u>2,  and 

•  take  this  equation  to  be  true  for  all  ui. 

It  will  be  easier  (as  above)  to  interpret  the  above  equation  as  true  for 
some  fixed  u>  and  to  assume  that  for  the  wave  at  this  frequency  u>  there  is  one 
pair  of  frequencies  (uq  and  u;2)  that  dominates  the  integral.  In  particular  the 
frequency  dependence  of  V  can  be  omitted. 

In  terms  of  the  quantities  of  experimental  interest,  the  above  equation 

becomes 


dYu 

dx 


=  VK,n,  +  ifc(w)y(4 


(1-5.10) 


derive 


It  is  a  simple  exercise  (left  to  the  reader)  to  take  the  above  equation  and 


0P(uQ 

dx 


—  V  B(uq,  uq)  +  V*  B*(uq,  uq). 


(1-5.11) 


One  should  now  follow  Powers  and  his  co-authors  and  re-write  the  above 
equation  in  terms  of  strictly  real  quantities  VR,  Vj,  Bk,  and  B/,  where 

2V  =  Vr  +  iVj,  and  B  =  Br  +  iB,.  (1-5.12) 

This  gives  the  wonderful  equation 


0PM 

dx 


=  KrBb(«|,W2)  +  V/Bj((*q,uq). 


(1-5.13) 


This  equation  demonstrates  that  power  is  put  into  the  wave  at  frequency 
u>  (as  one  advances  in  the  x  direction)  due  to  coupling  with  the  waves  at  uq  and 
uq.  Further,  the  real  part  of  the  bispectrum  gives  the  amount  of  that  coupling 
due  to  the  real  part  of  the  coupling  coefficient  V  (and  similarly  for  the  imaginary 
part). 
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1-6  Bispectrum  =  Rate  of  Transfer  of  Energy. 


Since  the  bispectrum  measures  the  amount  of  coupling  between  two  fre¬ 
quencies  and  since  this  coupling  governs  the  rate  at  which  energy  is  exchanged 
between  these  frequencies,  it  follows  that  the  bispectrum  provides  information 
regarding  how  energy  moves  from  one  frequency  range  to  another.  This  has  been 
demonstrated  in  the  previous  section.  This  information  is  of  particular  interest  in 
at  least  two  disciplines:  the  study  of  turbulence  and  the  study  of  nuclear  fusion 
in  plasmas. 

This  section  shall  present  a  brief  exposition  of  the  theory  of  homogeneous 
turbulence.  Much  of  the  experimental  bispectral  literature  deals  with  this  problem 
because  the  bispectrum  is  so  intimately  related  to  the  quantities  of  experimental 
and  theoretical  interest.  If  one  did  not  know  better,  one  might  even  suspect  that 
this  entire  subject  was  invented  to  provide  an  application  for  bispectra.  In  a 
sense,  the  discussion  in  this  section  is  the  dua *  of  the  previous  section  in  that 
here  one  considers  energy  transfer  in  time  rather  than  in  space  and  that  here  the 
bispectrum  is  extended  to  functions  of  space  rather  than  time.  (The  reader  should 
feel  free  to  skim  the  remainder  of  this  section.  He/she  is  encouraged  not  to  skip 
it  entirely.) 

In  homogeneous  turbulence,  one  is  concerned  with  an  infinite  body  of 
fluid  and  its  random  motions  [1.6-1].  These  motions  are  taken  to  be  independent 
of  position  on  average,  and  hence  homogeneous.  The  fluid  is  taken  to  be  incom¬ 
pressible  and  can  be  characterized  by  its  (uniform,  i.e.,  position-independent) 
density  p  and  its  (uniform)  kinematic  viscosity  v.  This  is  of  course  a  vast  simplifi¬ 
cation  of  typical  situations.  However,  even  this  simple  problem  evades  mathemat¬ 
ical  solution.  Moreover,  one  can  experimentally  produce  a  good  approximation 
to  homogeneous  turbulence  by  passing  a  fluid  perpendicularly  through  an  array 
of  metal  bars  (“grid-generated  turbulence”)  [1.6-1,  1.6-2]. 

The  state  of  the  fluid  is  specified  by  providing  its  velocity  v  and  pressure 
p  as  functions  of  time  t  and  space  x.  The  fluid  motion  is  assumed  to  be  random 
so  that  one  is  primarily  concerned  with  expectations  and  especially  correlations 
in  time  and/or  in  space.  For  example,  one  may  desire  (vx(x[,  t)vx(x2,  t)vz(x3,  t)) 
or  (■ vx{x,ti)vx(x,t2 )). 

The  motion  of  the  fluid  is  governed  by  the  Navier-Stokes  equation, 

Du  =  — -Vp+  uV2u, 

P 

and  is  subject  to  the  incompressibility  constraint 


(1-6-1) 


(1-6.3) 


The  operator  D  is  the  kinematic  derivative  and  is  defined  as 

D  =£+*•*• 

The  requirement  of  homogeneity  (averages  are  independent  of  position)  supplies 
the  boundary  conditions  (in  an  unusual,  though  pleasant,  fashion). 

Turbulence  is  inherently  a  “multi-scale”  phenomenon.  It  is  caused  by 
interaction  of  phenomena  which  take  place  on  very  different  length  scales  (from 
clearly  visible  to  microscopic).  Thus  a  decomposition  of  the  motion  of  the  fluid 
velocity  in  terms  of  motions  at  different  length  scales  is  suggested.  One  is  led  to 
write 

v(x,t)  =  /  exp(ik  ■  x)dz(k,t).  (1-6.4) 

This  is  a  Cramer  or  Fourier-Stieltjes  representation  of  the  velocity  field. 
The  mathematics  (which  will  be  dealt  with  in  Chapter  3)  is  not  important  here: 
all  that  is  required  is  the  understanding  that  the  velocity  is  written  in  terms  of 
motions  which  are  on  varied  length  scales.  The  vector  k  specifies  the  scale  of 
the  motion  -  the  greater  the  magnitude  of  k  the  smaller  the  scale  (wavelength). 
The  integral  is  taken  over  all  possible  scales  and  is  evaluated  at  a  particular  time. 
This  makes  it  possible  to  ask  questions  such  as  what  portion  of  the  total  energy  of 
the  fluid  motion  is  due  to  motions  on  a  particular  length  scale  and  how  does  this 
fraction  evolve  with  time.  In  other  words,  one  can  trace  the  transfer  of  energy  in 
time  from  one  spatial  frequency  range  to  another. 

It  must  be  stressed  that  the  interest  is  in  spatial  frequencies  rather  than 
temporal  ones.  One  is  concerned  primarily  with  velocity  as  a  function  of  position 
at  a  particular  time  rather  than  with  velocity  at  a  particular  place  as  a  function 
of  time.  However,  this  fact  makes  no  difference  to  the  mathematics.  In  par¬ 
ticular,  the  homogeneity  assumption  is  the  spatial  analog  of  the  requirement  of 
stationarity,  so  the  “space”  series  have  the  usual  properties  of  time  series.  Fourier 
transforming  the  Navier-Stokes  equation  and  simplifying  gives  an  equation  for  the 
dynamics  of  spectral  energy  transfer, 

A  1 

Jt  [£*»»(M)]  =  J  T(k,k',t)dk'  -vPQuik,  t).  (1-6.5) 

In  this  equation,  $nn(k,t)  is  the  spectral  energy  density  (at  a  particular 
time),  T  is  the  energy  transfer  function,  and  k  is  the  magnitude  of  k.  If  T  is 
known  then  one  can  say  how  energy  in  a  particular  (spatial)  frequency  range 
changes  in  time  due  to  transfer  from  motions  at  other  (spatial)  frequencies. 


The  spectral  energy  density  $  is  defined  in  terms  of  the  velocity  compo¬ 
nents  as 

t)dkvdk2dkz  =  (dz*(k,  t)dzj{k ,  <)).  (1-6.6) 

As  one  might  suspect  from  the  introduction  to  this  section,  the  energy  trans¬ 
fer  function  T  is  simply  expressed  in  terms  of  the  (spatial)  bispectrum3  of  the 
velocity-field: 

T(k,k',t)  =  -k-  lm[B{k,k')}.  (1-6.7) 

This  equation  states  that  the  net  energy  transfer  per  unit  time  from  frequencies 
in  the  vicinity  of  k'  to  those  near  k  is  proportional  to  the  imaginary  part  of  the 
bispectrum  of  the  velocity  field.  Thus,  as  stated  earlier,  the  bispectrum  tells  one 
how  energy  moves  in  frequency  space. 

1-7  Bispectrum  =  Decomposition  of  Cube  of  Time  Series. 

Defining  the  power  spectrum  and  the  bispectrum  as  in  section  1.2,  one 
gets  the  Parseval  relation 

<si2(<)>  =  57/“  (1-7-1) 

which  shows  that  the  power  spectrum  represents  the  contribution  to  the  total 
energy  (or  second  moment)  of  a  particular  frequency  range.  One  also  finds  the 
relation 

.  1  r  00  roc 

( y  (0)  =  /9  /  /  B(u?i,u,2)rfa;i^a;2,  (i_7.2) 

yin)*  J- 00  J-o o 

which  shows  that  the  bispectrum  represents  the  contribution  to  the  (unnormal¬ 
ized)  skewness,  or  third  moment,  of  a  particular  pair  of  frequencies. 

This  result  demonstrates  how  deeply  the  analogy  between  the  second  and 
third  order  quantities  persists  and  modestly  contributes  to  the  reader’s  insight 
into  the  interpretation  of  the  bispectrum.  It  is  difficult  to  attach  much  physical 

3For  completeness,  the  “vector”  bispectrum  in  the  equation  above  is  defined  as 
B(k,k‘)  =  ^  —  J  Rn(x,x')exp(— i[k  ■  x  +  k'  •  x'])dxdx' 


where  the  “vector”  triple  correlation  in  turn  is 


Ll'Ll'l  l'i  a •  |*4 ■  ■ 


or  mathematical  importance  to  this  result  (which  is  why  it  has  been  relegated  to 
the  end  of  this  chapter). 

1-8  The  Bispectrum  Can  Be  Interpreted  in  a  Variety  of  Ways. 

This  introductory  chapter  has  presented  quite  a  number  of  ways  of  re¬ 
garding  the  bispectrum.  It  presents,  in  fact,  a  rather  broad  survey  of  the  bispec- 
tral  literature.  With  luck,  at  least  one  or  two  of  these  perspectives  may  benefit 
the  reader.  Some  of  these  examples  (particularly  the  homogeneous  turbulence 
example)  may  be  overly  technical:  if  so,  ignore  them!  It  is  certainly  true  that 
understanding  these  specific  examples  is  not  necessary  in  order  to  understand  the 
bispectrum. 

The  remainder  of  this  report  is  more  concrete.  The  following  chapter 
presents  some  of  the  fundamental  mathematics  of  the  bispectrum,  including  a 
derivation  of  the  usually  mysterious  “fundamental  domain”.  From  this  point  on, 
all  mathematics  will  either  be  worked  out  in  detail  or,  at  least,  stated  carefully 
and  precisely. 
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CHAPTER  2 


Mathematical  Properties  of  the  Bispectrum 


2—1  The  Principal  Domain  of  a  Bispectrum  is  a  Consequence  of  Sta- 
tionarity,  “Reality”,  and  Symmetries. 

This  section  will  involve  discrete  time  series  solely.  The  Fourier  trans¬ 
form  is  in  this  case  a  continuous  function  of  frequency  which  is  2 it  periodic.  Given 
a  real,  stationary  third-order  correlation  function  c3(n!,n2,n3),  the  principal  do¬ 
main  of  the  bispectrum  is  as  shown  in  Figure  2-1.  This  fundamental  domain 
consists  of  an  isosceles  triangular  subset  and  an  odd  or  extra  triangle. 

The  shape  of  this  principal  domain  is  a  consequence  of  stationarity,  re¬ 
ality,  discrete  time,  and  symmetry  properties.  This  shall  be  worked  out  in  the 
next  few  pages. 

Recall  that  c3  is  defined  as 

C3{ni,n2,n3)  =  ( x(t  +  ni)x(f  +  n2)x(t  +  n3)).  (2-1.1) 

First,  the  discrete  time  nature  of  c3  instructs  one  to  represent  the  Fourier 
transform  as 

1  00 

/s(A|,  A2,  A3)  =  — —  Y,  c3(n,,n2,n3)  exp[-?(A,n,  + A2n2  + A3n3)] 

\^>  ni  ,ni  ,ri3  =  -oo 

(2-1.2) 

where  — 7r  <  At,A2,A3  <  n  [2-1]. 

Second,  one  imposes  stationarity  on  c3;  i.e.,  one  requires  that 

c3(n,,n2,n3)  =  c3(n,  -  n3,n2  -  n3,0)  =  C3(n,  -  n3,n2  -  n  -  3).  (2-1.3) 

Here  the  last  equality  serves  as  definition  of  the  usual  form  for  the  stationary 
bicovariance  function.  For  the  sake  of  completeness,  note  that  this  function  obeys 
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the  symmetries 


C3(niin2)  —  C3(n2)ni)  — 
C3(-n1,n2  -  n,)  =  C3{ni  -  n2, -n2). 


(2-1.4) 


But,  more  relevantly,  this  stationarity  implies  that  /3  is  nonzero  only 


when 


where 


(A,  -4"  A2  T  A3)  mod  27t  —  0. 
One  can  now  write  the  bispectrum  as 

B(o>i,a>2)  =  {X(a?i)X(u;2)X(a>3)}, 
uj3  2?r  n  —  uq  —  cj2, 


(2-1.5) 


(2-1.6) 


(2-1.7) 


and  the  value  of  n  is  restricted  to  0  and  1  when  uq,  u>2,  and  u;3  are  confined  to 
the  range  — tt  to  7r.  (The  value  n  =  -1  is  also  permissible  but  it  gives  redundant 
information  and  will  be  eliminated.) 

At  first  sight,  one  might  expect  that,  at  least  for  some  uq,  u>2,  there  may 
be  two  possible  values  of  u>3  (one  corresponding  to  n  =  0,  the  other  to  n  —  1).  In 
fact,  for  any  specified  uq  and  u>2  there  is  a  unique  n  as  well  as  a  unique  u>3. 

It  is  clear  that  the  value  of  the  bispectrum  is  not  changed  if  any  pair  of 
frequencies  is  interchanged.  This  observation  provides  the  following  symmetries 


B(uq,w2)  =  B(u>2,uq), 

B(uq,u>2)  =  B(w3,uq),  and 
B(uq,u;2)  =  B(w3,  w2), 

which  are  depicted  in  Figure  2-2. 

For  n  =  0,  these  equations  become 

B(oq,  oj2)  =  B(— uq  —  u>2,uq)  and 
B(uq,u>2)  =  B(—  uq  —  u>2,u;2), 

as  shown  in  Figure  2-3. 

For  n  =  1,  these  relationships  are 

B(uq,u>2)  =  B(27r  —  uq  —  u>2,uq)  and 
B(uq,u’2)  =  B(27T  —  uq  —  u>2,  u’2), 


(2-1.8) 


(2-1.9) 


■m 

»  _  »  k  rTuM 


(2-1.10) 

(2-1.11) 
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as  shown  in  Figures  2-4  and  2-5. 

If  the  original  time  series  is  assumed  to  be  real  the  bispectrum  possesses 
the  additional  symmetry 


B(wj,  u>2)  —  [B(—  uq,  — U>2)]*  > 


(2-1.12) 


where  the  asterisk  denotes  complex  conjugation.  This  is  shown  in  Figure  2-6. 

Using  these  symmetries,  a  principal  domain  of  the  bispectrum  can  be 
identified.  From  the  conjugation  symmetry,  it  follows  that  the  regions  of  the  (uq, 
u> 2)  plane  shown  in  Figure  2-6  are  equivalent.  Thus  it  is  only  necessary  to  consider 
the  upper  half  of  the  plane  (positive  u>2).  From  the  second  n  =  0  equation,  the 
two  triangular  regions  shown  in  Figure  2-3  are  equivalent.  Combining  the  second 
n  =  0  equation  with  the  conjugation  symmetry  and  the  uq  <->  uq  symmetry,  it  can 
be  seen  that  the  two  remaining  triangular  regions  are  also  equivalent  so  that  it  is 
only  necessary  to  consider  positive  values  of  both  uq  and  uq.  The  uq  «->  uq  sym¬ 
metry  applied  to  the  positive  (uq,  uq)  quadrant  results  in  the  equivalence  of  the 
two  octants  shown  in  Figure  2-2.  The  first  n  =  1  equation  implies  that  the  two 
triangles  in  Figure  2-4  are  equivalent,  while  the  second  n  =  1  equation  implies 
that  the  two  triangles  in  Figure  2-5  are  equivalent.  No  other  symmetry  relation¬ 
ships  can  be  applied  to  further  eliminate  equivalent  regions  in  the  bispectrum. 
Thus  the  principal  domain  is  the  triangular  region  shown  in  Figure  2-1. 
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For  a  Properly  Sampled  Continuous  Time  Signal,  the  “Extra” 
Triangle  is  “Forbidden”. 


Data  processing  with  digital  computers  necessitates  dealing  with  discrete 
( sampled )  representations  of  signals.  Thus  the  bulk  of  this  report  emphasizes 
the  discrete  time  case.  However,  time  (so  far  as  one  knows)  is  fundamentally 
continuous.  Moreover,  the  continuous  nature  of  time  makes  itself  felt  even  in 
discrete  time  signal  processing.  Specifically,  if  (1)  the  discrete  time  model  arises 
from  sampling  an  underlying  continuous  time  process  and  (2)  this  continuous  time 
process  is  bandlimited  and  (3)  the  sampling  rate  is  sufficient  to  avoid  aliasing, 
then  the  discrete  bispectrum  is  nonzero  only  in  the  isosceles  triangular  subset  of 
the  principal  domain. 

This  result  shall  be  shown  below. 
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A  “bridge”  between  the  discrete  time  bispectrum,  6(Ai,A2,  ''3)1  and  the 
continuous  time  bispectrum,  6(uq,  a'2,  u?3),  is  given  in  Brillinger  and  Rosenblatt 

[2-2]-  _ 

6(Ai,A2,A3)=  ^  b(u>h  <^2?  ^3)  (2-2.1) 

U>1  ,U>2  .^3 

where  Ai  +  A2  +  A3  =  0  (mod27r),  and  uq  +  u;2  -f  lj3  =  0  with  a;,-  =  A,  +  2k  ji  and 
ji  is  an  integer  (  and  i  =  1,2,  or  3).  (In  this  section  ,  the  tilde  is  used  to  denote 
discrete  time.) 

The  bispectra  are  written  with  the  third  argument  explicit  for  simplicity 
of  exposition.  Notice  that  the  arguments  of  6  must  sum  to  zero  whereas  the 
arguments  of  6  can  be  any  integral  multiple  of  2n.  Also  notice  that  this  sum  is 
consistent  with  the  replication  phenomenon  familiar  in  sampled  systems.  That 
is,  the  continuous-time  spectrum  is  replicated  an  infinite  number  of  times  over  a 
three-dimensional  lattice  with  side  length  2k. 

Now  assume  that  the  underlying  system  is  bandlimited.  Further  assume 
that  the  system  has  been  sampled  appropriately  (above  the  Nyquist  frequency). 
Then  k  corresponds  to  the  Nyquist  frequency  or  above  and  one  may  state  that 

6(uq,u;2,u;3)  =  0,  if  |uq|,|u;2|, or  |u;3|  >  7r.  (2-2.2) 

Now  the  argument  is  simple.  Choose  a  point  in  the  ?dd  triangle.  This 
implies  that 

Ai  +  A2  +  A3  =  2k 
and  there  is  no  reason  not  to  take 

—k  <  A,-  <  K. 

AH  uq,u;2,u;3  which  correspond  to  this  set  of  A  are  included  in  the  equation 

0  =  oq  +  a>2  +  u>3  =  A 1  +  A2  +  A3  +  2k  (jl  +  jf2  +  y3),  (2-2.3) 

where  the  ji  are  all  integers  (possibly  zero).  It  is  immediately  clear  that  one  has 

ji  +  j2  +  j3  =  -1-  (2-2.4) 

Thus  at  least  one  of  the  ji  cannot  be  zero.  The  simplest  case  is  ji  =  —  1 
and  J2  and  j3  =  0.  This  leads  to 
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so  uq  must  lie  outside  of  the  Nyquist  range.  But  this  is  a  contradiction  and  shows 
that  a  properly  sampled  signal  cannot  lead  to  a  nonzero  bispectrum  in  the  odd 
triangle. 

Although  this  result  is  not  surprising,  the  relevant  literature  hats  been 
ambivalent  regarding  its  validity.  Part  of  the  reason  for  skepticism  is  the  existence 
of  data  taken  under  conditions  where  aliasing  is  very  unlikely  but  which  shows 
unmistakable  peaks  in  the  forbidden  triangle.  The  author’s  conclusion  is  that  the 
signals  involved  must  have  been  non-siationary ,  and  that  it  would  be  desirable 
to  pursue  an  understanding  of  the  bispectrum  in  the  non-stationary  case.  (The 
only  mention  of  this  extension  in  the  literature  that  the  author  is  aware  of  is  to 
a  problem  involving  speech  recognition.) 

2-3  The  Most  Basic  Mathematical  Properties  of  the  Bispectrum  are 
those  which  Determine  the  Fundamental  Domain. 

The  elementary  properties  of  the  bispectrum,  namely  its  symmetries  and 
its  relation  to  a  stationary  time  series,  are  those  properties  which  determine  the 
fundamental  domain.  As  shown  here,  however,  this  derivation  requires  working 
out  the  combined  implications  of  all  of  these  properties  simultaneously.  The 
treatment  here  is  perhaps  one  of  the  most  explicit  available. 

The  next  section  shall  extend  the  mathematical  study  of  the  bispectrum. 
The  study  will  be  done  in  the  context  of  a  concrete  model  which  can  be  used  to  re¬ 
produce  any  possible  bispectrum.  This  model  should  provide  the  reader  with  both 
insight  and  some  useful  manipulative  techniques  for  studying  the  bispectrum. 


I 


B 


,-VaV. 

yv«vV\ 

1  a  hAk 


& 

Is 


V.\ 

.v.Va 


CHAPTER  3 


The  Universal  Bispectrum  Model 


3-1  The  Universal  Bispectrum  Model  is  the  Simplest  Model  that  can 
Reproduce  any  Possible  Bispectrum. 

This  chapter  thoroughly  explores  a  particular  (discrete)  time  series  model 
which  can  reproduce  the  bispectrum  of  any  possible  time  series.  This  model  is 
not  motivated  by  physical  or  intuitive  considerations  (unfortunately);  however,  it 
does  provide  a  tool  for  learning  useful  mathematics,  for  testing  software,  and  for 
illustrating  and  exploring  the  properties  of  the  bispectrum. 

It  may  be  wise  to  start  with  the  corresponding  model  for  the  power 
spectrum.  Statisticians  are  very  familiar  with  the  time  series  model 

OO 

x(t)  =  e(t)  +  ]T  g{m)e(t  -  m).  (3-1.1) 

m=0 

This  model  represents  an  (infinite  order)  moving  average  (MA)  process.  Here 
e(t)  is  a  random  variable  (taken  to  be  normally  distributed,  for  simplicity)  and 
values  of  e  at  different  times  are  statistically  independent.  In  standard  notation, 
the  e(t)  =  N(0,  a2)  and  are  “independently,  identically  distributed”  (i.i.d.).  The 
e(t)  can  be  regarded  as  random  “inputs”  (or  innovations).  The  coefficients  (or 
weights)  g(m)  can  be  chosen  such  that  the  time  series  x(t)  has  any  desired  power 
spectrum.  So,  this  model  may  be  considered  to  be  a  universal  power  spectrum 
model.  (There  are  other  models  with  this  property.) 

The  universal  bispectrum  model  is  a  simple  generalization  of  the  above. 
A  convenient  form  is 

PC 

x(t)  =  c(f)  +  r]{t)  +  ^2  g(m,n)e(t  -  m)g(t  -  m  -  n),  (3-1.2) 

rn~0 

where  now  the  model  contains  two  independent  random  processes.  That  is,  for 
all  values  of  t,  t(t)  and  g(t)  =  N(0,er2)  are  i.i.d.  It  may  suffice  for  now  to  think  of 
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the  universal  bispectrum  model  as  a  sort  of  infinite  MA  model.  The  g(m,n)c(t) 
can  be  regarded  as  random  MA  weights  and  the  g(t)  can  be  regarded  as  the 
innovations.  The  model  is  completely  specified  by  a  and  the  weights  g.  There 
are  other  models  which  can  produce  any  possible  bispectrum,  but  the  above  is 
the  simplest  to  use. 

In  an  initial  reading,  the  remainder  of  this  chapter  can  be  omitted.  Its 
primary  purpose  is  to  document  the  universal  bispectrum  model  for  future  use. 
Necessarily  it  is  very  mathematical.  However  the  mathematics  have  been  spelled 
out  in  perhaps  painful  detail  so  as  to  be  appropriate  for  an  introduction  to  the  bi¬ 
spectrum.  The  hope  is  that  this  chapter  can  serve  to  make  the  earlier  discussions 
more  concrete  and  thus  enhance  the  reader’s  ability  to  deal  with  the  bispectrum 
productively.  The  remaining  sections  of  this  chapter  are  organized  as  follows: 
the  time  domain  properties  of  this  model  -  the  second-order  (3.2)  and  third- 
order)  3.3)  cumulants  are  derived;  these  functions  are  then  Fourier  transformed 
to  get  the  power  spectrum  (3.4)  and  the  bispectrum  (3.5);  the  inverse  problem 
is  examined  (3.6);  the  model  is  reformulated  and  studied  in  the  frequency  do- 
main(3.7,8);  and  finally,  speculations  on  ways  in  which  this  model  may  arise  in 
practice  (3.9)  are  presented. 

3-2  The  Second-order  Cumulant  for  the  Universal  Bispectrum  Model 
is  Rather  Messy  and  Not  Very  Informative. 

First  on  the  agenda  is  to  obtain  the  second-order  cumulant  function: 
c2(r )  =  (  x(t)  x(t  +  r) ) 

=  1  -  igt)  +  y]  g(m,  n)t(t  —  m)rj{t  —  m  —  n)\ 

m.n 

K-  F  r"  +r?(t  +  r) 

+  ^  g(m',  n)  e(t  +  r  —  m')  g(t  +  r  —  m1  —  n')  ])  (3-2.1) 

m'.n' 

=  (t(t)e(t  +  r)  +  ri(t)ri{t  +  r) 

+  J2  g(m,n)g(m',n') 

m.m'.n.n' 

e(t  —  m)  c(<  +  r  —  m')  g(t  —  m  —  n)  rj{t  +  r  —  m!  —  n') ). 

So  far  we  have  just  substituted  for  x  and  used  the  fact  that  t  and  r/  are  independent 
variables.  Next  we  shall  use  the  fact  that  ((t.)  is  independent  of  all  e(t')  for  t'  ^  t 
(and  similarly  for  r/).  The  expectation  of  the  e(t)e(t  +  7-)  term  becomes  a26r  (and 
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so  does  the  corresponding  term  for  g).  Here  6r  is  the  Kronecker  delta  function 
which  is  zero  except  when  its  argument  (r)  has  the  value  zero,  in  which  case  it 
is  one.  The  expectation  of  the  complicated  summation  also  simplifies  in  so  far 
as  only  those  terms  for  which  m'  =  m  +  r  and  n'  =  n  contribute.  So  the  final 
expression  becomes 


(3-2.2) 


This  expression  is  included  more  for  completeness  than  for  its  usefulness.  Note 
that  this  formula  does  have  the  r  «-»  —  r  symmetry  of  c2(r),  but  that  this  symmetry 
is  not  blatantly  on  display  here. 


OO 

c2(r)  =  2cr2<5r  +  (X4  ^2  g(m,n)g(m  +  r,n). 

m,n= 0 


3-3  For  the  Universal  Bispectrum  Model,  the  Third-Order  Cumulant 
Equals  <r4  times  the  Weight. 

The  third-order  cumulant  function  is  evaluated  in  a  similar  fashion  (but 
it  turns  out  nicer): 

c3(r,s)  =  (x(t)x(t  +  r)x(t  +  s)))  = 


([e(0  +  T)(t)  +  g(m,n)e(t  -  m)g(t  -  m  -  n)]  (3-3.1) 

m.n 

A  B  C 

[e(t  +  r)  -f  T](t  +  r)  -f  ^  g{m\  n')e(t  +  r  -  m')g(i  -f  r  —  m'  -  n')] 

m*  ,n* 

i  »  t*t 

[e(t  +  s)  +  g(t  +  s)  +  ^  g(m",n")e(t  +  s  —  m")r](t  +  s  —  m"  —  n")j). 

m",n" 

The  above  product  contains  27  factors.  However  the  single  rule  that  only  terms 
with  an  even  number  of  factors  of  t’s  and  r/’s  will  have  a  non-vanishing  expectation 
eliminates  all  but  six  terms.  These  six  terms  all  involve  a  (double)sum  of  a  product 
of  a  <7,  two  e’s,  and  two  7/’s.  In  order  for  these  terms  to  contribute,  the  arguments 
of  the  two  c’s  must  be  at  the  same  time  and  similarly  for  the  t/’s.  This  reduces 
each  of  the  six  double  summations  to  a  single  term.  These  six  terms  are  shown 
below  using  the  notation  source  —*  target.  The  source  specifies  the  particular 
combination  of  three  factors  which  leads  to  a  non-vanishing  expectation  and  the 
target  denotes  the  weight  g  involved  in  the  surviving  combination. 


■  Xtf*  i-V  *  i 


,^vi  *♦»< 


c3(r,«) 


=  {  x(t)x(t  +  r)x(t  -f  s))) 

=  (  iBiii  —  ■*  #(s,  -r) 

+  lCii  — >  g(r,  —5) 

+  2 Ain  — >  g(s  —  r,r) 

+  2Ci  — >  g(r  -  s,s) 

+  3Aii  — ►  g{—r,r  —  s) 

+  3 Bi  — >  g(— s,  s  —  r)  ). 


(3-3.2) 


The  two  e’s  and  r/’s  in  these  terms  contribute  a  factor  of  <r4  to  the  expectation 
above,  so  the  final  result  is 


c3(r,s)  =  <r4  {  g(s,  -r) 

+  9(r,  ~s) 

+  g(s  —  r,  r) 

+  g{r-s,s) 

+  9(-r,r-s)  }. 


(3-3.3) 


This  equation  is  already  much  nicer  than  equation  (3-2.2),  but  it  is  even 
better  than  it  appears.  For  a  specific  choice  of  r  and  s,  in  general  only  one  term 
will  be  non-zero  in  the  above  sum.  For  example,  if  r  and  s  are  both  positive, 
and  if  r  >  s,  then  only  the  g(r  —  s,  s)  term  is  non-zero.  Causality  (the  fact  that 
g(r ,  s)  =  0  if  r  <  0  or  s  <  0  )  reduces  the  above  sum  to  one  term.  Moreover,  the 
remaining  terms  are  just  those  that  are  needed  for  c3(r,  s )  to  obey  the  symmetry 
requirements  of  a  third-order  correlation  function.  So,  in  essence,  the  third-order 
correlation  function  c3(r,  s)  is  equal  to  g(r  —  s,s),  except  for  a  factor  of  cr4,  and 
unless  g(r  —  s,s)  =  0,  in  which  case  one  must  take  the  appropriate  non-zero 
image . 

However,  things  are  slightly  messier  than  implied  above.  If  either  r  or  s 
is  equal  to  zero,  or  if  r  and  s  are  equal  to  one  another,  then  several  terms  in  the 
above  sum  may  simultaneously  be  non-zero,  so  that  c3(r,  s)  and  g(r  —  s,s)  differ 
by  a  small  factor  in  addition. 
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The  exact  relation,  which  is  equivalent  to  equation  (3-3.3),  is 


65(0,0) 

2(7(r,0) 


r  =  0,  s  =  0 
r  >  0,  s  =  0 


(3-3.4) 


g{r  —  s,  s)  r  >  s,  s  >  0  l±J 

2g(0,  r)  r  =  5,  s  >  0  CD 

g(s  —  r,r)  r  >  0,s  >  r  0 

25(0,s)  r  =  0, 3  >  0  0 

c3(ri s)  =  '  g(s,—r)  r  <  0, s  >  0  0.  (3-3.4) 

25(0,  -r)  r<  0,5  =  0  0 

g(—s,s  —  r)  r<s,s<  0  0 

2g(—  r,0)  r  =  s,s<0  0 

g(—r,r  —  s)  r  <  0,s  <  r  00 
25(0,  -s)  r  =  0,s<0  00 
g(r,—s)  r  >  0, s  <  0  10 

The  boxed  numbers  ranging  from  0  to  12  refer  to  Figure  3-1. 

This  figure  presents  the  r,s  plane  and  sb^ws  how  each  term  above  con¬ 
tributes.  Each  region  assigned  an  odd  number  corresponds  to  the  line  which  is 
the  boundary  between  the  two  adjacent  wedges.  Each  region  assigned  an  even 
number  represents  one  of  the  six  primary  regions  above  (except  for  the  region 
assigned  the  number  zero,  which  represents  the  origin  only).  Thus,  for  example, 

in  region  0,  03(7-,  s)  is  equal  to  g(s,  — r). 

3-4  The  Power  Spectrum  of  the  Universal  Bispectrum  Model  is  Some¬ 
what  Informative. 

The  power  spectrum  is  defined  as  the  Fourier  transform  of  the  second- 
order  cumulant  function,  so  that 

/2(A)  =  £  c2(r)  e~'Xr 

r— -00 

00  00 

=  X  [2o-26r  +  a1  X5(m*n)5(77i  +  r,n)]  e_lAr  (3-4.1) 
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Figure  3-1  The  c3  plane  in  terms  of  the  weights  g. 

oo  oo 

=  2<r2  +  Y,  g(m  +  r,  n)  e~,Xr 

m.n  r— -oo 

oo  oo 

=  2t2  +  <74  5;  <,(">,„)  £  g(s,„) 


Therefore,  if  one  defines  =  Y  9{k in)  e  then 

k—-o o 


(3-4.2) 


/2(A)  =  2<rJ  +  o"1  jr  |j„(A)|2. 


(3-4.3) 


This  result  is  somewhat  helpful  in  limiting  circumstances.  In  particular, 
if  the  weights  are  small  in  comparison  with  <r,  the  power  spectrum  will  be  ap¬ 
proximately  flat  (independent  of  frequency)  with  magnitude  2cr2.  In  the  opposite 
limit,  the  power  spectrum  will  vary  as  the  magnitude  of  a  typical  g  squared. 

This  result  will  also  be  useful  later  in  determining  how  to  trade  off  the 
parameters  (cr  and  the  g's)  to  yield  the  model  with  smallest  variance  for  a  specified 
bispectrum. 
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3-5  The  Bispectrum  of  the  Universal  Bispectrum  Model  is  Given  by 
the  Frequency  Coupling  Coefficients. 

Remember  that  the  bispectrum  is  the  double  Fourier  transform  of  the 
third-order  correlation  function: 


OO 

b{\ !,A2)=  £  c3(r,s)e-i(Air+A^). 


(3-5.1) 


r.s-—oc 


The  following  definition  is  needed  in  order  to  re-express  6(Aj,  A2)  in  terms  of  the 
weights: 

I  __  1 

(3-5.2) 

The  g  will  be  referred  to  as  frequency  coupling  constants  for  reasons  to  be  made 
clear  in  section  3.6.  Substitute  for  c$(r,  s)  using  equation  (3-3.3).  The  result  is 
a  sum  of  six  terms.  The  detailed  calculation  for  the  term  involving  g{r  -  s,s)) 
follows.  The  contribution  of  this  term  to  the  entire  bispectrum  will  be  denoted 

because  only  g(r  -  s,  s)  is  non-zero  in  region  HI;  thus 

OO 

A2)  =  <T4  9(r  -  s,s)  exp[-i(rAi  +  sA2)] 

—I  r,3— -00 

OO 


=  (74  jT  g{r',s)  exp[— ^(r'Aj  +  s(Aj  +  A2))] 


r',«=- 00 


Similarly, 


=  ^1  +  A2). 


^At,  A2)  =  CT4$(^2,^l  +  ^2 

6rgi(A1,A-2)  =  cr4g(A2, -AO 


(3-5.3) 


^Tg^(A  1,  A2)  =  cr4^(-Ai  —  A2, -At) 
A(,  A2)  =  cr4g(-At  -  \2, -A2) 


^Y2l(A1,A2)  =  er4g(X[,  —  A2). 


The  final  result  is  just  the  sum  of  these  six  terms: 


(3-5.4) 


1  1/ 


+  5(^2^ —^i)  +  ^(Ai, —A2) 

+  g(— Aj  —  A2,  —  Aj)  +  g(— Ai  —  A2,  —  A2)].  (3-5.5) 

(The  reader  may  have  some  doubt  as  to  whether  this  result  takes  care 
of  the  “boundary”  regions  minm  etc.,  properly.  Rest  assured  that  it  does. 
Notice  that  includes  one-sixth  of  the  contribution  from  0,  and  one-half  of 

the  contribution  from  0  and  0,  so  that  the  sum  of  all  six  terms  does  in  fact 
correctly  include  the  regions  which  do  not  explicitly  appear  above.) 

As  was  the  case  for  the  corresponding  time  domain  result,  this  formula 
is  much  simpler  than  that  for  the  second-order  quantity.  Further,  it  is  simpler 
than  it  appears.  Originally,  the  author  used  the  above  equation  (3-5.5)  to  base 
a  presentation  of  the  “inverse”  problem  of  constructing  the  time  domain  weights 
due  to  a  particular  bispectrum.  The  author  currently  favors  the  approach  shown 
in  the  next  section  so  that  equation  (3-5.5)  is  no  longer  so  important  to  the 
development.  Yet  it  is  still  of  interest  to  solve  the  above  equation  for  g  given  b 
and  the  next  section  shall  show  how  this  can  be  done. 

3-6  Inversion  of  the  Model  is  Easy. 

This  section  will  show  how  to  go  backwards.  That  is,  given  a  bispectrum, 
the  corresponding  model  coefficients  will  be  determined.  For  simplicity,  it  shall  be 
assumed  that  the  desired  bispectrum  lies  entirely  in  the  interior  of  the  isosceles 
triangular  subset  of  the  principal  domain.  This  assumption  makes  exposition 
somewhat  easier,  but  it  is  by  no  means  difficult  to  redo  this  section  for  the  general 
case. 

Given  a  bispectrum  6  as  assumed  above,  define  &I(uq,u;2)  to  be  a  2tt 
periodic  (in  uq,u>2),  conjugation  symmetric  function  which  agrees  with  b  in  the 
fundamental  domain  but  is  zero  elsewhere.  One  can  call  £q  the  signature  of  the 
bispectrum.  The  signature  is  b  stripped  of  all  the  symmetries  of  b  except  for  the 
periodicity  and  the  conjugation  symmetry.  In  terms  of  &j,  b  can  be  written 


6(uq,uq)  = 


&l(wl>w2) 

-f  &i(u;2,uq) 

+  M^i,  — uq  —  u>2) 

+  6i(w2,  —U>[  —  U>2) 

+  M  — wi  —  ui2,  t<q) 

+  6i(— Wi  —  w2,  u>2) 
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This  expression  shows  that  b  is  a  symmetrized  version  of  its  signature 
and  has  ali  the  correct  bispectral  symmetries.  Thus  bt  is,  as  claimed,  b  “de- 
symmetrized” .  The  author  derived  this  equation  by  starting  with  the  hi(u>i,u>2) 
piece  and  applying  the  bispectral  symmetries  to  get  the  complete  bispectrum.  The 
assumption  that  b  is  non-zero  in  only  the  interior  of  the  fundamental  domain  has 
already  been  used,  for  otherwise  the  above  sum  would  overcount  contributions  on 
the  boundary.  This  overcounting  problem  is,  in  fact,  the  reason  for  the  simplifying 
assumption. 

Assume  next  that  the  inverse  double  Fourier  transform  of  b i  together 
with  the  conjugation-symmetric  part  of  b[  is  known.  This  double  Fourier  trans¬ 
form  is  real  and  it  is  defined  by 

r,s)  =  j  J  hi(u>i,u>2)  exp[i(ru;i  +  su>2)]du  idu>2  +  c.c.  (3-6.1) 

where  the  integration  limits  are  —n  to  7r, 

It  is  a  routine  matter  to  verify  the  following  table  of  transforms: 


=»  h(r,s) 

2iWj)  =*•  h(s,r) 

Mwj,  — ah  —  w2)  =>•  h(r  —  s,—s) 

&i(w2,  —  uq  —  u>2)  =>  h(s-r,—r) 

bl(-ojl  —  u2,uji)  =>  h(-s,r  —  s) 
b i(—u i  —  uj2,  u2)  =>  h(—r,  s  —  r),  (3-6.2) 


where  x  =>  y  symbolizes  the  fact  that  y  is  the  inverse  double  Fourier  transform 
of  x. 

Combining  equations  (3-6.1),  (3-6.3),  and  the  fact  that  the  bispectrum 
is  the  double  Fourier  transform  of  the  triple  correlation  gives 


c3(r)5)  =  h(r,s)  +  h(s,r)  +  h(r  —  s,  —  s)  +  h(s  —  r,  —  r)  +  h(  —  s,r  —  s)  +  h(- r,  s  —  r). 

(3-6.3) 

This  is  a  key  result  in  the  inversion  of  the  bispectrum:  it  expresses  the 
triple  correlation  in  terms  of  the  transform  of  the  bispectral  signature.  This  ex¬ 
pression  displays  explicitly  all  the  appropriate  symmetries  of  a  triple  correlation 
function  so  that  h  like  the  g  earlier  is  a  de-symmetrized  version  of  the  triple  cor¬ 
relation.  All  of  this  so  far  in  this  section  is  entirely  independent  of  the  universal 


bispectrum  model.  But  now,  using  the  universal  bispectrum  model,  one  is  essen¬ 
tially  done.  For  given  the  triple  correlation,  there  is  a  unique  corresponding  g. 
This  g  can  be  obtained  by  inverting  (3-3.4)  to  get 


f  i  c3(0,0)  r  =  s  =  0 

1  I  I  c3(r,0)  r  >  0,  s  =  0 


a(r  s)  -  —  J  2  w'"/  '  '  ~ 

9  ’  <r4  |  f  c3(0,s)  r  =  0,  s  >  0 


(3-6.4) 


c3(r  +  s,s)  r,  s  >  0 


Actually  the  universal  bispectrum  model  permits  an  even  more  direct 
solution  to  the  inverse  problem.  It  is  not  necessary  to  introduce  the  signature  at 
all.  One  just  takes  the  above  equation  and  rewrites  it  as 

d(r,s)  =  -j{c3(r  +  s,s)[u(r)u(s)  -  ^  u{r)S(s)  -  ^  6(r)u(s)  -f  ~  S(r)S(s)J) 


=  —  4  (r,s)m(r,s). 


(3-6.5) 


In  the  first  line  above,  u(r)  is  the  step  function  (u(r)  =  1  for  r  >  0  and 
u(r)  =  0  otherwise)  and  6(r)  is  the  Kronecker  delta  function.  In  the  second  line 
c'3(r,  s)  =  c3(r  +  s,  s)  and  m(r,  s)  is  the  sum  of  products  of  6  and  u  bracketed 
above. 

If  one  double  Fourier  transforms  both  sides  of  the  above  equation,  then 
the  left  side  will  give  the  g  s  and  the  right  side  will  be  the  convolution  of  the  given 
bispectrum  with  the  transform  of  m.  Explicitly,  one  can  write 


<H4,A2)  =  —  [c'3*m(A1,A2)] 


(3-6.6) 


with  c'3(A„  A2)  =  b{\ ,,  A2  -  A,). 


The  transforms  of  the  step  function  and  the  delta  function  are  simple  to 
get;  they  are 


1  —  exp(  —  iX) 


=  S  ^(A  —  2rm)  + 


-  cot  (A/2) 


(3-6.7; 


*  )fwv*  UDHTH  V"*.yTVT  VTTw^VI,VLTr.^a^^VL^7V\^W1  VWY.Vl  V3  vi  v  v  ^  vi  r*  v<  v  \ 

'  ‘ 


6(  A)  =  1,  (3-6.8) 

and  m  may  readily  be  obtained  by  appropriately  combining  the  above  results. 

Writing  the  above  convolution  somewhat  more  explicitly  gives 
1  *  * 

5r(Ai,  A2)  =  —  J  J  f>(At  -  w1?  A2  -  At  -  u>2)  m(uq,u:2)  duv  du2.  (3-6.9) 

—  7T  —K 

The  above  formula  expresses  the  g's  directly  in  terms  of  the  given  bi¬ 
spectrum.  It  is  thus  the  inverse  of  the  relation  obtained  at  the  end  of  the  previous 
section.  The  expression  above  is  not  suited  to  numerical  computations,  however, 
due  to  the  singularity  of  the  cotangent  functions  at  u>  =  0.  Presumably  it  may 
be  simplified  using  principal  value  techniques  by  analogy  to  the  continuous  time 
situation. 

One  could  now  go  on  to  recover  the  g's  from  the  g's  by  an  inverse  double 
transform.  This  provides  a  logically  straightforward  direct  route  from  the  desired 
bispectrum  to  the  model  coefficients.  Numerically,  however,  it  is  simpler  merely 
to  compute  c3  via  the  double  transform  and  then  apply  equation  (3-6.5).  The 
indirect  route  presented  first  is  just  an  attempt  to  explicitly  account  for  the 
symmetries  of  the  bispectrum  after  doing  the  double  transform. 

3-7  The  Universal  Bispectrum  Model  in  the  Frequency  Domain  is  as 
Simple  as  it  Can  Possibly  Be. 

The  preceding  sections  have  formulated  the  universal  bispectrum  model 
in  the  time  domain  (3.1),  computed  its  time  domain  properties,  first  (3.2)  and 
second  (3.3)  order  cumulants,  and  then  transformed  these  quantities  into  the 
frequency  domain  to  find  its  power  spectrum  (3.4)  and  its  bispectrum  (3.5).  An 
alternate  technique  is  to  formulate  the  model  in  the  frequency  domain  and  then 
to  compute  its  power  spectrum  and  its  bispectrum  directly.  This  approach  will 
be  taken  in  the  present  section.  Here  the  “frequency  coupling  coefficients”  take 
a  more  central  role. 

Start  with  the  time  domain  version  of  the  model,  equation  (3-1.2): 


x(t)  =  e(t)  +g(t)  +  ^2  g(m.n)e(t  -  m)r](t  -  m  -  n). 

rn.n- 0 

Express  x,  £,  g  in  terms  of  their  Fourier  components: 


:(0  =  I  e'“Vu.' 

2tt  J 2 it 


(3-7.1) 


(3-7.2) 
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Figure  3-2  Integration  limits  before  change  of  variables. 

(with  similar  expressions  for  c  and  i)). 

Therefore, 

J  x(u>)  e'utdu:  =  J  i(n)  e'^dfi  +  J  t?(u>)  e'^du  (3-7.3) 

+  V  g(m,  n)-—  /  e(/i)  [  g(v)  eil,{t-m~n)dv. 

m  n  2tt  J  J 


Continuing,  one  gets 

J[x(u)  -  e{u)  -  t){uj)}  e'^du  =  ^  (3-7.4) 

“  m  ,n— 0 

J  J  e{n)fj{u)  e-i^d^dv 

Now  is  a  good  time  to  change  variables  from  v  to  .  =  //  +  t/,  to  The 
Jacobian  of  this  transformation  is  one  (1),  conveniently!.  What  about  integration 
limits?  The  original  region  of  integration  in  the  p,i/  plane  can  be  chosen  as  the 
square  shown  in  Figure  3-2. 

The  new  region  is  then  the  rhombus  (Figure  3-3). 
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Figure  3-3  Integration  limits  after  change  of  variables. 


Continuing, 


f  -  c(u)  -  fj(ij)}  eiutdu  = 

~  J  f  e(u}  —  v)f)(i>)  )T  p(n,n)  e“,(wTO+'/n)  eluidvd.Lu.  (3-7.5) 

m,n=rO 

So  one  is  led  to  re-introduce  the  frequency  coupling  coefficients 

OO 

$(w,i/)=  £  g{m,  n)e~^um+l/nK  (3-7.6) 

m,n=0 

Therefore, 

J [r(u;)  —  e(w)  —  r)(u;)]  etwtdu>  =  —  J  J  g(u>,v)e( u>  —  elutdi/du>.  (3-7.7) 
Further, 

j [t(u>)  —  e(u’)  —  t)(u>)]  elutdu>  =  —  J  J  g(Lj,u)e(uj  —  i')7/(r')  e,uitdvdu.  (3-7.8) 

Currently  the  integration  domain  is  the  rhombus.  This  can  be  replaced 
with  the  more  convenient  (original)  square  in  the  oj,v  plane  by  the  following 
argument.  Consider  the  triangular  segment  of  the  domain  shown  in  Figure  3-4. 
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Figure  3-4  Migrating  triangular  piece  (before  migration) 
This  piece  contributes 

J  J  g(u),  i/)i(u  -  e'utdvdu). 

Now  change  variables:  v  — *  is,  u>  — ►  u/  =  u>  —  2tt.  One  gets 

J  J  g(u>'  +  27t,  +  27r  —  e‘^w  +2*^dvdu>' . 

Using 

g(u,i/)  =  g(u>  ±  2ir,  v)  ~  g(u>,  u  ±  2n) 

i I u)  =  t[u>  ±  27r), 


(3-7.9) 


(3-7.10) 


this  becomes 


J  J  g(u,  v)e(uJ  —  i>)f)(u)  c'^di/dui. 


(3-7.11) 


(3-7.12) 


The  triangle  /? ,  has  been  moved  over  to  become  the  triangle  R\  shown 
in  Figure  3-5.  In  this  fashion  the  region  of  integration  can  be  restored  to  the 
original  square. 

The  frequency  domain  equation  becomes 


f (u>)  -f/M]  ciutdu)  =  j |  —  j 


uj  —  v)g{ v)elutdv |  cliJtduJ, 
(3-7.13) 
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Figure  3-5  Migrating  triangular  piece  (after  migration). 


so  that  the  model  is 


(3-7.14) 


Equivalently,  in  Cramer  notation  (described  more  fully  in  the  next  sec¬ 
tion), 


(3-7.15) 


3-8  The  Bispectrum  Can  Be  Computed  Directly  in  the  Frequency 
Domain. 

It  is  now  a  simple  calculation  to  get  the  bispectrum  directly.  (The  calcu¬ 
lation  of  the  power  spectrum  is  left  as  “an  exercise  for  the  reader”.)  Cramer  nota¬ 
tion  makes  the  calculation  much  less  messy.  MacDonald’s  paper  [1.4-2]  presents 
a  good  discussion  of  the  physical  interpretation  of  the  Cramer  representation. 

Let  x(t)  be  a  real,  stationary,  discrete  time  stochastic  time  series.  Then 


its  Cramer  representation  is 


*(<)  =  h 


(3-8.1) 


Useful  properties  are 

dx(—u>)  =  dx(u>) 

and  { dx(u>i)dx(u>2 ))  =  8'(uji  -f  u2)  f i)duj  idu)2 
where  c2(t)  =  ^  J  etuT f2{u>)du>. 

—  7T 

For  the  independent  Gaussian  inputs  ( innovations ),  one  finds 

(de(uji)di(uj2))  —  <j26'(lj1  -{-  uj2)dui[duj-1, 

(di){<jj  {)dr)(<jj2))  =  cr2<5,(tt>i  +  u:2)dujiduj2, 

{de(ioi)dfi(u)2))  =  0. 


(3-8.2) 


(3-8.3) 


Here  we  use  the  notation  6'(x)  =  Y1  8(x-\-2wk).  This  is  the  2tt  periodic 

k  —  —  oo 

extension  of  the  Dirac  delta  function  that  Brillinger  calls  the  “Dirac  comb”  and 
denotes  r](x). 

MacDonald  presents  the  following  formula  for  the  bispectrum  using  the 
Cramer  representation: 


B(/i ,/2)  dh  df2  S(fuf2,fz)  =  E{di(/i),dx(/2),dx(/3)}. 


(3-8.4) 


This  formula  must  be  put  in  terms  of  radian  frequency  uj  as  opposed 
to  angular  frequency  /.  This  means  one  must  distinguish  between  the  function 
dx/(f)  and  dxw(u>).  The  previous  dx’s  are  actually  dxj s.  The  Cramer  represen¬ 
tation  for  x  in  terms  of  angular  frequency  is 


x(t)  =  Jy^d^f). 


and  this  implies  that  (w  =  27r /) 


dif(f)  =  dxu(uj). 


Further,  for  all  positive  a 


<5(ax)  =  —  <5(x). 
a 


(3-8.5) 


(3-8.6) 


(3-8.7) 
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Using  these  results,  MacDonald  s  equation  becomes 
B(u>!, UJ2)  du>  1  du>2  6(u)l,u>2,uj3)  =  27rE{dic(u>1),  dx(u>2),  di(u;3)}. 
Thus  one  needs  to  compute 


( dx(ul)dx(u)2)dx(u>3 )} 


=  +  dtiiuji)  +  ^  J  9{v !,  1 ')di(uJi  -  i')fj(t')} 


[  r* 

{di(u>2)  +  dfi(oJ2)  +  —  y_  £(a>2,  iy)de(u>2  -  *>)] 


[de(w3)  +  d7?(u3)  +  —  g{u>3,  v)de{u>3  - 

=  (  IBiii  +  2 Am  +  3 An  +  +  1C it  +  2Ci) 

7T 

=  —  / ^  ^(u;3,j/")de(a;i)dc(u;3  -  J/")d7)(o;2)d77(j/") 

—  7T 

+  fif(cj3,  i/")dc(w2)dc(^3  -  u")dfi{ui)dfi(u”) 

+  g(u>l,v)de(u2)d(i(u>i  -  v)dr){uj3)dr){v) 

+  g(uu  1 v)dt(oj3)dt(u)l  -  u)dr}{oj2)dr]{v) 

+  ^/)</e(c*?i)</e(c^2  -  ^')dg{oJ3)dfi{v')^ .  (3-8.9) 

There  are  six  expectations  above:  the  first  one  will  be  worked  out  fully. 
The  remaining  expectations  involve  nothing  new. 

(^r  /  y(UJ3'l/"^  dH^i)dH^3  ~  v")  dr)(u>2)dfi(v ")j 

—  7T 

1  IT 

=  2tt  / 
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=  —  f  g(u>3,  v")  (de(u)i)d(.(u>3  —  u"))  <t26\lj2  +  v")du2dv" 

27r  J 


=  7T  9(^3,  -w2)  ( de(u>i)de(u)3  +  u;2))  cLo 2 

Z7T 

=  —  #(w3,  — u>2)  6'(uq  +  w2  +  ^3)  du  idu2du>3. 


(3-8.10) 


Here  cj2  has  been  chosen  such  that  —  it  <  o;2  <  ir.  Due  to  the  integration 
limits  only  the  6'  term  shown  in  the  fourth  line  above  contributes.  Further,  the 
final  (fifth)  line  is  a  consequence  of  V(i),d2;r  =  0.  Substitute  this  result  (and  the 
corresponding  results  for  the  remaining  terms)  into  the  complete  equation  to  find 

(dx(ujl)dx(u>2)dx(  u>3))  = 

<t4/(2t)  {5(^3,  — o>2)  +  g{u)3,  ~ ^1)  +  <j{u i>— ^3) 

+  9(^11  ~u>  2)  +  9{<^2i  —U3)  +  g(u>2, —LUi)} 

-)-  ui2  -(-  li 3)  duqdcj2du>3.  (3—8.11) 

Comparing  this  expression  with  MacDonald’s  equation  (in  radian  fre¬ 
quency)  one  recovers  the  formula  for  the  bispectrum  in  terms  of  the  frequency 
coupling  coefficients 

B(uq,u;2)  =  <t4  {  g(w3,  —  u>2)  +  p(u;3,  —  uq)  +  g(uh  —  W3) 

+  $(w„- u;2)  +  g(u2, -w3)  +  $r(u>2, -uq)  }.  (3-8.12) 

3-9  The  Universal  Bispectrum  Model  Might  Arise  in  Practice,  Al¬ 
most. 

The  universal  bispectrum  model  was  written  in  Section  3-1  with  little 
in  the  way  of  motivation.  The  unfortunate  truth  is  that  the  model  was  developed 
on  mathematical  grounds  and  there  is  no  known  physically  realistic  system  to 
which  it  might  apply.  This  section  shows  how  the  obvious  attempts  to  construct 
the  universal  bispectrum  model  come  close  but  do  not  quite  succeed. 

One  can  write  the  general  infinite-order  moving  average  model  as 


x(t)  =  e(t)  +  9(nMt  ~  n). 


(3-9.1) 


just  as  in  Section  3.1. 
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Here  e(i)  is  a  random  variable  (taken  to  be  normally  distributed,  for 
simplicity)  a,nd  values  of  e  at  different  times  are  statistically  independent. 

Next  assume  that  the  weights  g(n)  themselves  are  stochastic  functions 
of  an  independent  time-dependent  random  variable  77: 


g(m)  =  -  n-m). 


m—0 


Combining  the  above  two  equation  gives 


x(t)  =  e(t)  +  5Z  bnmV{t  ~n~  m)e(t  -  n). 

n=0  m—0 


(3-9.2) 


(3-9.3) 


The  above  equation  is  almost  the  universal  bispectrum  model.  There  is  a  term  77(f) 
missing.  This  flaw  seems  to  afflict  any  effort  to  derive  the  universal  bispectrum 
model  in  a  natural  fashion.  “Derivation”  of  the  model  in  a  natural  fashion  is 
another  opportunity  for  the  reader! 

It  is  possible  to  rewrite  the  model  above  in  an  equivalent  auto- regressive 
form.  If  equation  (3-9.1)  is  written  as 


x(t)  =  e(t)  -f  ^2  o{n)x(t  —  n) 


(3-9.4) 


n— 0 


with  auto-regressive  coefficients  a(n)  chosen  to  yield  the  same  time  series  as  given 
by  the  g{n)  and  if  c„m  are  chosen  to  relate  to  the  a(n)  as  the  bnm  relate  to  the 
g(n)  then  equation  (3-9.3)  becomes 


x(t)  =  e(f)  +  J2  x(l  ~  n)  H  cnmT){t  -n-m). 


(3-9.5) 


n=0 


m—0 


This  model  suffers  the  same  defect  as  (3-9.3),  but  perhaps  may  suggest  a  defect- 
free  model. 

3-10  The  Universal  Bispectrum  Model  Can  Be  Clarified  by  a  Dia¬ 
gram. 

The  primary  concern  of  this  chapter  has  been  to  determine  how  to  com¬ 
pute  a  time  series  which  corresponds  to  a  given  bispectrum.  A  physicist  might  call 
this  the  inverse  problem  or  a  statistician  may  call  this  the  identification  problem. 
A  specific  model  which  makes  the  solution  of  this  problem  possible  (and  easy) 
has  been  introduced  and  termed  the  universal  bispectrum  model.  One  method 
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Figure  3-6  Commutative  diagram  for  Universal  Bispectrum  Model. 

of  using  this  model  is  to  desymmetrize  the  bispectrum  to  get  its  signature,  fej, 
double  Fourier  transform  the  signature  to  get  h,  then  resymmetrize  to  get  c3,  and 
then  desymmetrize  to  get  g.  The  g's  then  tell  one  how  to  compute  a  time  series. 
A  second  method  would  reconstruct  the  g's  from  the  original  bispectrum,  and 
then  double  Fourier  transform  to  get  the  g's  and  then  get  the  time  series.  The 
various  relationships  and  possible  pathways  are  summarized  in  Figure  3-6. 

Now  it  remains  to  see  how  this  works  in  practice!  Chapter  4  will  present 
some  necessary  statistical  tools  and  then  Chapter  5  will  present  the  model  “under 
fire”  . 
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CHAPTER  4 


Estimation  of  the  Bispectrum 


4-1  At  Least  Four  Techniques  are  Used  To  Estimate  the  Bispectrum. 

Now  it  is  time  to  consider  how  the  bispectrum  can  be  estimated  from 
sample  data.  Assume  a  discrete  time  series  is  somehow  obtained.  (If  the  model 
is  inherently  discrete,  then  the  bispectrum  is  estimated  for  the  entire  principal 
domain.  If  the  model  is  inherently  continuous,  then  it  should  be  anti-aliased 
filtered  before  sampling  (that  is,  all  frequencies  higher  than  twice  the  sampled 
frequency  should  be  removed)  and  it  should  be  estimated  only  for  the  smaller 
“support  set”  as  given  in  Chapter  2.)  In  either  case,  there  are  four  techniques 
which  are  in  common  use  for  estimating  the  bispectrum. 

The  first  technique,  due  to  T.  Subba  Rao  and  M.  Gabr  [4-3]  involves  com¬ 
puting  the  triple  correlation,  smoothing  it  appropriately,  and  then  transforming 
it  to  get  the  bispectrum.  This  is  the  natural  approach  based  on  the  Section  1.2 
definition  of  the  bispectrum.  The  Gabr-Rao  technique  is  documented  in  their 
book  and  will  not  be  discussed  further  in  this  report. 

The  second  technique  is  due  to  P.  Huber  and  his  co-authors  [4-1].  Ac¬ 
tually  there  are  several  variations  in  technique  here,  but  the  essential  idea  is  to 
estimate  the  Fourier  transform  of  the  time  series  and  then  find  the  expectation 
of  products  of  the  transforms.  This  is  the  natural  way  to  employ  the  definition 
of  the  bispectrum  presented  in  Section  1.3. 

The  third  technique  is  due  to  M.  Hinich  [4-8.1]  and  is  closely  related 
to  that  of  P.  Huber.  The  primary  difference  is  that  the  averaging  involved  in 
estimating  the  Fourier  transform  and  the  subsequent  averaging  of  these  estimates 
to  compute  the  bispectrum  are  distinct  in  the  Huber  approach  and  are  performed 
together  in  the  Hinich  approach. 

The  fourth  technique  is  very  recent.  It  is  due  to  M.  Raghuveer  [4-2]  who 
claims  it  has  much  higher  resolution  than  conventional  methods  (i.e..  techniques 
1-3  above)  especially  when  used  on  short  data  sequences.  The  technique  involves 
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assuming  an  underlying  model  for  the  process,  estimating  the  parameters  which 
characterize  this  model  and  then  computing  the  bispectrum  in  terms  of  the  esti¬ 
mated  parameters.  A  somewhat  odd  feature  of  this  technique  is  that  two  distinct 
models  are  used,  the  parameters  for  the  model  used  for  the  bispectrum  being 
distinct  from  those  used  to  estimate  the  power  spectrum.  This  technique  will  not 
be  further  discussed  in  this  report,  but  Raghu veer’s  paper  is  recommended,  not 
least  for  the  bibliography  which  contains  many  recent  applications  of  bispectral 
techniques. 

The  remainder  of  this  chapter  is  devoted  to  the  Hinich  technique  for 
estimating  the  bispectrum  from  sampled  data.  Particular  attention  is  paid  to 
both  the  theoretical  underpinnings  of  the  result  (the  author  has  great  difficulty 
in  using  a  formula  without  any  idea  of  its  origin)  and  its  use  in  practice. 

4-2  One  Must  Distinguish  the  Sample  Fourier  Transform  from  the 
Quantity  it  Estimates  -  the  Population  Fourier  Transform. 

Before  tackling  the  bispectrum,  it  is  wise  to  consider  a  much  simpler 
problem.  In  this  post-FFT  era,  it  is  simple  to  compute  the  (discrete)  Fourier 
transform  of  a  given  (discrete)  time  series.  But  what  does  this  FFT  tell  one? 
If  the  sampled  time  series  is  deterministic,  then  the  computed  FFT  is  the  true 
transform  (aside  from  effects  due  to  sampling  interval  and  sample  size).  However, 
if  the  sampled  time  series  is  stochastic,  then  the  computed  FFT  is  just  an  estimate 
of  the  desired  transform.  If  one  were  to  take  another  sample  one  would  get  another 
estimate.  Still  it  seems  likely  (and  is  true)  that  the  expected  value  of  the  FFT 
estimate  equals  the  true  (or  population)  transform.  What  is  surprising,  however, 
is  that  the  FFT  is  not  a  consistent  estimator  of  the  population  transform.  That 
is,  no  matter  how  long  a  time  series  one  takes,  the  standard  deviation  of  the 
FFT  estimate  for  a  given  frequency  does  not  approach  zero  (rather  it  remains 
constant).  In  other  words,  one  cannot  get  increasingly  good  estimates  of  the 
population  transform  merely  by  increasing  the  sample  size. 

Estimating  frequency  domain  quantities  is  therefore  a  more  subtle  mat¬ 
ter  than  would  appear  at  first  sight.  One  must  look  at  estimator  asymptotic  bias 
(does  the  expected  value  of  the  estimator  approach  the  population  quantity  as 
sample  size  increases?)  and  consistency  (does  the  estimator  get  better  as  one 
progressively  increases  the  sample  size  -  i.e.,  in  the  limit,  does  the  variance  of  the 
estimator  approach  zero?).  Moreover,  one  still  has  the  concerns  of  the  determin¬ 
istic  case:  in  particular,  estimator  range  (the  Nyquist  frequency  determines  the 
maximum  frequency  for  which  an  estimate  can  be  made)  and  especially  resolution 
(how  closely  spaced  in  frequency  can  one  get  reliable  estimates?). 
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Ideally,  one  would  like  to  have  a  procedure  that,  given  tolerable  bounds 
on  bias  and  estimator  standard  deviation  and  specified  resolution  and  frequency 
range,  would  specify  the  sampling  parameters  of  the  sampled  time  series  (number 
of  sampled  series  k,  number  of  values  in  a  single  sample  N ,  Nyquist  frequency 
fpe,  etc.).  It  may  not  be  possible  to  physically  attain  the  required  sampling 
parameters  -  computational  limits  or  sensor  limitations  may  intervene.  In  this 
case,  the  procedure  should  provide  some  idea  of  what  performance  is  possible. 

The  following  sections  derive  an  estimator  which  implements  such  a  pro¬ 
cedure  -  the  Hinich  estimator. 

4-3  Fanfare  for  Brillinger’s  Formula. 

This  section  is  devoted  to  an  exposition  of  one  formula.  This  formula  (if 
pressed)  can  answer  virtually  any  question  about  the  sampling  properties  of  the 
FFT  and  is  the  key  result  underlying  the  Hinich  estimator. 

Suppose  x(t)  is  a  discrete  time,  zero  mean,  stationary  stochastic  time 
series.  The  discrete  Fourier  transform  (DFT)  of  ar,  denoted  x,  is  defined  by 

x-i 

x(uoj)  =  z(n)  exp[— iu^n].  (4-3.1) 

n=0 

The  x  is  a  sample  quantity,  an  estimate  of  the  true  or  population  spec¬ 
trum.  Each  x(u>j)  estimates  the  Fourier  amplitude  at  u>j  =  2nj/N. 

The  statistical  properties  of  the  discrete  Fourier  transform  estimate  of  the 
frequency  spectrum  are  spelled  out  in  a  formula  due  to  Brillinger  and  Rosenblatt. 
Brillinger  presents  this  formula  without  emphasizing  its  usefulness.  Here  this 
formula  gets  its  due. 

The  formula  is  actually  an  infinite  set  of  equations,  one  equation  for 
each  number  of  sample  quantities  involved.  On  the  left  side  of  the  formula  is  an 
expectation  of  products  involving  at  most  n  sample  quantities.  On  the  right  side 
is  an  expression  involving  the  population  quantities  and  the  sample  size  N.  It  is 
possible  to  use  this  set  of  equations  to  determine  the  expectation  of  virtually  any 
combination  of  DFT  estimators  (as  will  be  seen  shortly). 

The  left  side  expectation  is  the  cumulant  of  n  FFTs.  Cumulants  have 
been  mentioned  occasionally  throughout  this  report,  but  only  now  is  it  time  to 
take  them  seriously.  Temporarily  suffice  it,  just  to  indicate  a  commonly  used 
notation  for  cumulants: 


i,3-2 . xnj. 


(4  3.2) 


The  next  two  sections  provide  the  necessary  background.  Using  this 
notation,  Brillinger’s  formula  [2-1]  is 

(4-3.3) 

Here  x,  denotes  x(u;t),  N  is  the  sequence  length,  n  is  the  number  of  individual 
frequencies  being  looked  at,  A(x)  is  defined  as 


[xt, . . . ,  in]  =  NA(uq  +  •  •  •  +  w„)/„(u>,, . 


i)  +  0(1). 


A(x) 


1  if  x  mod  2tt  =  0 
0,  otherwise 


(4-3.4) 


and  /„  is  the  population  cumulant  spectrum  defined  by 


/n(u>  i,...,u;n_1)  =  ^[x(<  +  u1),...,x(<  +  u„_1)]exp[-f(u;1ui  -f - h  ^n-iUn-t)], 

(4-3.5) 

where  the  summation  is  over  all  integral  values  of  the  {tq}  and  is  independent  of 
t  by  the  assumption  of  stationarity. 1 

Equations  (4-3.3)  and  (4-3.5)  are  simplified  and  less  general  versions  of 
those  presented  in  Brillinger.  A  factor  of27r1-”  has  been  removed  from  equation 
(4-3.3)  and  this  removes  a  corresponding  factor  from  equation  (4-3.5),  which  now 
provides  the  usual  definitions  of  the  population  spectrum  and  bispectrum  for 
n  =  1  and  n  —  2,  respectively. 

Basically  Brillinger’s  formula  states  that,  asymptotically  (i.e.,  as  N  tends 
to  infinity),  the  cumulant  of  the  transform  is  the  transform  of  the  cumulant.  This 
is  not  surprising:  the  bispectrum  is  initially  defined  as  the  transform  of  a  cumulant 
(Section  1.2)  but  it  is  also  equal  to  the  cumulant  of  the  transforms  (Section  1.3). 
Thus  one  should  expect  that,  asymptotically,  the  same  relation  will  be  true  of  the 
sampled  quantities. 

In  the  following  sections,  the  key  to  applying  Brillinger's  formula  is  to 
bear  in  mind  the  distinction  between  population  and  sampled  quantities,  and  for 
the  latter  to  pay  attention  to  the  highest  power  of  N  that  appears.  Hopefully,  this 
fact  shall  soon  become  clear. 

(Note  also  that  essentially  the  same  simplified  version  of  Brillinger’s  for¬ 
mula  has  been  presented  (and  proved)  in  a  readable  paper  by  Kim  and  Powers. 

[4-3-1]) _ 

'The  0(/(N))  notation  is  standard  and  means  that  terms  of  order  /(N)  or  lower  may  be 
present  but  have  not  been  explicitly  displayed.  In  this  case,  this  notation  means  that  constants 
(terms  independent  of  N)  or  terms  that  vary  as  an  inverse  power  of  N  (i.e.,  terms  which  become 
smaller  as  N  grows)  are  not  explicitly  shown,  Nonetheless,  for  large  N,  the  only  important  term 
is  the  one  shown. 


4-4  A  Diversion:  Cumulants  are  Simply  Expectations  with  Lower- 
Order  Dependence  Removed. 

Treatment  of  cumulants  has  been  postponed  to  this  section  because  it 
is  somewhat  intricate  and  was  not  necessary  until  now.  Further,  the  author 
hopes  that  the  reader  has  become  convinced  of  the  importance  of  cumulants 
sufficiently  to  tolerate  this  section  and  the  next.  However,  even  here  the  precise 
general  definition  will  not  be  given.  Rather  the  remainder  of  this  section  will  be 
pulled  primarily  from  Doo  Whan  Choi’s  Ph.D.  dissertation  [4.4-1]  which  gives  a 
particularly  intuitive  treatment  of  cumulants. 

Cumulants  are  essentially  expectations  with  lower  order  statistical  de¬ 
pendence  removed.  If  one  wants  the  cumulant  of  n  random  variables,  one  con¬ 
structs  the  expectation  of  the  product  of  all  n  variables  and  then  adds  terms  so 
that  the  net  result  will  vanish  if  any  subset  of  the  variables  is  independent  of  any 
other  subset. 

For  n  —  2,  this  shows  immediately  that 

[X1,X2]  =  (X!X2)  -  (x  i)  (x2) ,  (4-4.1) 

since  here  one  starts  with  the  expectation  (xtx2)  and  it  is  clear  that  the  right-hand 
side  of  equation  (4-4.1)  vanishes  if  xt  and  x2  are  independent. 

For  n  =  3,  one  would  then  start  with 

[xi,x2,x3]  =  (xiX2x3)  -  (xi)(x2)(x3)  -  other  terms  (4-4.2) 

which  vanishes  if  Xi,x2,  and  x3  are  mutually  independent.  But  suppose,  following 
Choi,  that  only  x3  is  independent  of  and  x2  and  that  xv  and  x2  are  not 
independent.  Then  the  right-hand  side  of  equation  (4-4.2)  (neglecting  the  other 
terms)  becomes 

(x,x2)(x3)  -  (x1)(x2)(x3)  =  [x j , x2] (x3) .  (4-4.3) 

Thus  the  term  (4-4.3)  must  be  included  in  the  other  stuff  as  well  as  the 
terms  one  gets  by  interchange  of  3  with  1  and  3  with  2.  The  final  definition  of 
the  cumulant  of  third  order  is 

[x,,x2,x3]  =  (xjx2x3)  -  (xi)(x2)(x3) 

-  [-r  i ,  -x2]  (-r  a) 

-  [x,,x3](.r2).  (1-4,1) 

One  can  see  that  this  procedure  may  rapidly  become  cumbersome  for 
larger  n  and  that  explicit  rules  are  needed  to  efficiently  compute  these  higher- 
order  cumulants.  However,  these  rules  themselves  are  fairly  complicated  and  the 
interested  reader  is  referred  to  the  references. 


The  remainder  of  this  section  is  for  the  more  knowledgeable  or  curious 
reader.  Notice  from  the  above  equation  that  for  ihe  zero  mean  case  the  third-order 
moment  and  the  third-order  cumulant  are  the  same.  It  is  usually  emphasized  that 
the  bispectrum  is  the  Fourier  transform  of  the  cumulant  rather  than  the  moment, 
in  spite  of  the  fact  that  these  quantities  are  equal. 

If  one  views  the  iispectrum  as  the  first  in  a  sequence  of  higher  order 
po/yspectra,  grounds  for  this  distinction  do  emerge.  In  the  higher  order  case, 
where  cumulants  and  moments  are  distinct  quantities,  there  are  two  reasons  for 
choosing  cumulants  over  moments. 

First,  cumulants  have  better  independence  properties  than  moments. 
Moments  contain  information  about  lower  order  moments,  whereas  cumulants 
are  constructed  in  such  a  way  that  each  order  cumulant  has  the  dependence  on 
lower  order  ones  removed.  For  example,  it  is  meaningful  to  set  all  cumulants 
above  the  second  order  to  zero  (and  thus  get  a  Gaussian  process),  whereas  it  is 
impossible  to  set  all  moments  above  any  specific  order  to  zero.  (It  is  true  that 
these  higher  order  moments  contain  no  additional  information  and  need  not  be 
specified  for  a  Gaussian  process,  but  they  are  not  zero  -  that  is  just  the  point. 
Incidentally,  there  is  a  theorem  which  states  that  any  non-Gaussian  process  has  an 
infinite  number  of  nonvanishing  cumulants.  Thus  if  the  bispectrum  of  a  particular 
process  is  non-zero,  it  must  be  true  that  some  higher  order  polyspectra  are  also 
non-zero.) 

Second,  Brillinger  has  shown  that  for  a  commonly  encountered  class 
of  processes,  Fourier  transforms  of  cumulants  are  better  behaved  than  Fourier 
transforms  of  moments.  He  first  shows  that,  if  the  moments  and  the  cumulants 
are  distinct,  then  either  but  not  both  of  the  corresponding  transforms  can  be 
mathematically  well-behaved  (in  the  sense  of  being  proper  functions  rather  than 
requiring  Dirac  delta  functions).  Next  he  shows  that  for  ergodic  processes,  the 
transforms  of  cumulants  are  proper  functions  and  therefore  transforms  of  mo¬ 
ments  must  involve  delta  functions.  (For  an  ergodic  process  the  time  average  of 
any  quantity  for  a  single  representative  sequence  equals  the  average  of  the  same 
quantity  at  a  fixed  time  but  taken  over  all  members  of  the  ensemble.  In  effect, 
any  representative  sequence  (or,  realization)  of  an  ergodic  time  series  eventually 
explores  the  entire  range  of  behavior  typical  of  its  ensemble.  Moreover,  ergodic 
processes  are  common  both  in  theory  -  because  it  is  convenient  to  be  able  to 
interchange  ensemble  and  time  averages  -  as  well  as  in  practice  because  the 
“range  of  possible  behavior”  in  typical  ensembles  is  typically  connected  so  that 
any  one  series  will  do  anything  possible  for  its  entire  class.) 


4-5  Everything  We  Need  To  Know:  Table  of  Zero  Mean  Cumulants 


There  are  many  rather  complicated  formulae  or  sets  of  rules  for  evalu¬ 
ating  cumulants  of  arbitrary  order.  For  present  purposes  it  is  simpler  to  merely 
tabulate  the  cumulants  that  shall  be  used  and  refer  the  reader  elsewhere  for  the 
general  rules.  The  ones  that  shall  be  used  in  the  following  sections  are  those  of 
orders  less  than  or  equal  to  six  and  for  random  variables  with  zero  expectation. 
The  notation  used  below  is  as  follows.  Rectangular  brackets  enclose  cumulants; 
angular  brackets  enclose  expectations.  The  braces  with  subscripted  numbers  are 
to  be  interpreted  as  instructions  to  replace  the  enclosed  term  with  the  sum  of  all 
distinct  terms  obtainable  from  the  enclosed  term  by  permutation  of  indices.  The 
subscript  outside  the  brace  denotes  how  many  terms  should  be  obtained.  (This 
notation  is  similar  to  that  used  by  Dr.  Choi  in  his  dissertation,  but  it  is  not  quite 
the  same.) 


[*i]  = 

[xbx2]  = 

[X|,  X2,X3]  = 

[*i,*2,x3,x4]  = 

[*^1 5  ®2t  ^5] 

[x  l ,  X2,  X3,  X4,  X5,  Xg]  — 


0 

( X[X2 ) 

(xlx2x3) 

(x,x2x3x4)  ~  {(xjx2}(x3x4)}3 
{x [X  2x  3x  4x  -  {(x1x2x3}(x4x5)}l0 
(XiX2X3X4XbXG)  -  {(x,x2x3x4)(x5x6)}  15 
-  {(xJx2x3)(x4x5x6)}10 
+  2{(x1x2)(x3x4)(x5x6)}15 


4-6  The  Practice:  Previous  Results  Allow  One  To  Estimate  the  Power 
Spectrum. 

The  foregoing  sections  will  now  be  put  to  good  use.  In  this  section,  the 
statistics  (expectation  and  variance)  of  the  usual  estimate  for  the  power  spectrum 
will  be  studied.  In  particular,  the  DFT  will  be  shown  to  be  an  it. consistent 
estimator  of  the  population  spectrum.  However,  the  main  purpose  of  this  section 
is  to  provide  practice  for  the  following  section  which  applies  the  same  techniques  to 
the  study  of  the  statistical  properties  of  the  Hinich  estimator  for  the  bispectrum. 

(Note  that  neither  this  section  nor  the  following  section  actually  carries 
through  the  analysis  to  the  end,  i.e.,  practical  estimation  of  the  spectrum  and 
bispectrum.  These  sections  are  intended  to  provide  only  an  understanding  of  the 
basic  results  underlying  the  practical  techniques.  Also  note-  that  all  time  series 


shall  be  assumed  to  be  zero  mean  so  that  the  cumulants  in  the  preceding  table 
will  be  applicable.) 

Define  the  “periodogram”  estimator  by 


p»M  = 


x(<*>i)x*(w,) 


It  is  more  convenient  to  study  the  more  general  quantity 
I.  /  \  x(uq)x(u>2) 

Ei(Wi,w2)  =  - - - . 

The  expectation  of  E(  is  given  by 

/T-  /  u  (*(wi)x(w2))  [x(w>,),x(u>2)] 

(E^w,, lj2))  =  - ^ - =  - n - ’ 

where  the  second  equality  follows  from  the  cumulant  table. 

By  Brillinger’s  formula,  this  is 

(Ei(uq,w2))  =  A(uq, tu2)/2(uq)  +  0(N  *) 


(4-6.1) 


(4  6.2) 


(4-6.3) 


so  that,  asymptotically, 


(P  l(a’i))  —  (E 


(4-6.4) 


(4-6.5) 


and  examination  of  (4-3.3)  shows  that  /2  is  the  population  spectrum  so  that  P| 
is  an  unbiased  estimator  of  the  population  spectrum.  For  future  use,  note  that 
equation  (4-6.3)  indicates  that 


<^i  ^  u>2 -*  (E,(uq,u;2))  =  0(N  '), 


(4-6.6) 


and  therefore  estimates  of  amplitudes  at  different  frequencies  are  asymptotically 
independent. 

Next,  the  variance  of  the  periodogram  estimator  is  desired.  One  wants 


<(P,(uq)  -  /2(uq))2)  =  ((P.fuq))2)  -  /2(uq). 

The  expectation  of  P/2  is 

P2  =  ((P  i (w, ) ) 2)  =  ( 1  / N 2 )  ((.f(cu1).f(u;i).f(-cu|)x(-u.'i)), 
which  is  contained  in  the  cumulant 

rt  —  ( 1  / N 2 )  [xl.x2,x;(,x.|]. 


(4-6.7) 


(4  6.8) 


1  6.9) 


y.yl 


(4-6.11) 


x3  = 


Xi 


From  the  cumulant  table 


C4  =  P2-  {(*i^2)<^3^4>}3, 

where  the  1/N2  times  the  symmetrized  sum  in  (4-6.11)  is 

(i(w1)x(u;1))(x(-u;1)x(-u;1)) 

+(*(wi)x(-a>1))(x(wi)x(-w1)) 

+(x(u>l)x(-u>,))(x(u>l)x(-u;1)). 

Keeping  only  the  highest  order  terms  in  N  gives 

(2/N2)  (x(w1)x(-w1))(x(o;1)x(-w1)), 


(4-6.12) 


(4-6.13) 


so  that 

c4  =  P2  -  (2/N2)  (x(o>i)£(-a;1))(x(h;|)£(-u;1)).  (4-6.14) 

The  A  function  is  automatically  one  since  uq  +  u;2  +  u;3  -f  uq  =  0,  so 
Brillinger’s  formula  gives 

c4  =  -“ll  +  e>(N~2).  (4-6.15) 

Therefore 


{(Pi  M)*)-/?M  =  (AM)2 

+(1/N)  /3(u;1,a;1,-u;1)  +  0(N-2)  (4-6.16) 

and,  as  N  — *  oo, 

((P1(u;1))2)-/2(u;1)  =  /2(u;1).  (4  6.17) 

The  right-hand  side,  being  a  population  quantity,  is  independent  of  N, 
proving  that  the  variance  does  not  vanish  asymptotically  and  that  the  “peri- 
odogram”  estimator  is  not  consistent.  This  result  explains  why  smoothing  and 
other  techniques  are  required  to  estimate  the  spectrum. 
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A  small  diversion  is  now  in  order.  Typically,  the  bispectrum  is  not 
presented  raw,  rather  it  is  normalized.  The  normalized  version  is  usually  called 
the  bicoherence.  Powers  defines  the  normalized  bispectrum  by 


b2(uq,u>2) 


|B(o?i,  ug)j2 

<|x(u;)|2>{|x(w,)x(u;2)|2)‘ 


(4-6.18) 


This  definition  of  bicoherence  has  the  property  that  6  is  confined  to  range 
between  0  and  1  and  is  very  nicely  interpreted  as  the  fraction  of  the  total  power 
at  u>  which  is  due  to  the  coupling.  (This  result  is  easy  to  derive:  write  down  62/P 
and  just  substitute  for  these  quantities  using  the  formulae  above.) 

Power’s  definition  of  the  bicoherence  is  not  the  only  one  possible:  in  fact, 
the  bicoherence  is  more  commonly  defined  by 


b2(uq,  u>2) 


_ |B(oq,  u>2)  j2 _ 

(|x(w)|2)(]x(a>,)|2)(|i(u;2)|2)‘ 


(4-6.19) 


If  the  x’s  were  population  quantities  rather  than  sample  quantities,  then  the 
two  definitions  would  be  equivalent  (the  joint  expectation  in  Power’s  definition 
would  factorize  as  shown  earlier).  In  general,  there  should  not  be  much  difference 
between  the  normalizations.  However,  the  problem  of  choosing  between  them 
still  remains.  There  is  little  discussion  in  the  literature  on  this  topic.  The  bulk 
of  the  statistical  literature  chooses  the  second  form  without  much  in  the  way  of 
explanation.  The  author’s  preference  is  for  Power’s  definition  because  it  possesses 
the  properties  above. 


4-7  The  Payoff:  Previous  Results  Allow  One  To  Estimate  the  Bi¬ 
spectrum. 

The  same  type  of  argument  can  be  used  to  derive  the  statistical  proper¬ 
ties  of  the  Hinich  estimator.  This  derivation  in  this  section  does  not  seem  to  have 
been  published  previously,  however.  Define  the  “raw  bispectral”  estimator  by 

F|(w,,u>2)  =  (1/N)  xfcvjxju.^);^^),  (4-7.1) 

where  uq  +  u>2  +  u>3  =  0  and,  for  simplicity,  uq  and  cv2  are  taken  to  lie  in  the 
interior  of  the  isosceles  triangular  subset  of  the  principal  domain.  One  now  wants 
the  expectation  and  variance  of  this  estimator. 

By  analogy  to  Section  4.6.  the  expectation  of  F  is  obtained  by  relating 
it  to  a  third  order  cumulant  and  then  using  Brillinger's  formula  to  evaluate  this 
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cumulant. 


(F(^i,w2))  = 


{j(a?t)z(u>2)x(a>3)) 


[x(ag),x(a;2),x(^3)] 

N 

=  /3(wl5u;2)  +  0(N-1), 

where  the  second  equality  follows  from  the  cumulant  table. 

Next,  the  variance  of  the  raw  estimator  is  desired.  One  wants 

((F(wt)w2)  —  f 3(^11  UJ2))2)  =  ((F(u>|,fa>2))2)  — /j  (u>l,UJ2). 

The  expectation  of  F2  is 


which  is  contained  in  the  cumulant 


where 


where 


c6 —  1’  ®3»  X5, 


Xi  =  x4  =  x(-Wx) 

x2  =  x(w2),  x5  =  x(-u>2) 
x3  =  x(w3),  X6  =  x(-U>3). 

From  the  cumulant  table  and  Brillinger’s  formula, 

c6  —  (l/N2)  [at  +  a2  +  a3  +  a4] 

=  (l/N) /6(wi,a;2,w3,u;4,u;5)  +  0(N  2), 


a  1  =  {x,x2x3x4x5x6}  =  (|xi|2|x2|2|x3|2) 

a2  —  { (x  iX2x3x4) 

«3  =  -{(xJX2X3)(x4X5X6)}|0 

«4  =  -{(Xix2)(x3x4)(x5x6)}15. 


(4-7.2) 


(4-7.3) 


((F(u;uuj2))2)  =  (l/N2)  ((x(w1)x(a;2)x(i«;3)x(-w1)x(-w2)x(-w3)),  (4-7.4) 


(4-7.5) 


(4-7.6) 


(4-7.7) 


(4-7.8) 


The  above  expressions  can  be  simplified  by  keeping  only  the  highest 
order  terms  in  N.  One  knows  that  the  highest  order  contribution  comes  from 


fc'XH'fc' 

■ 


Mt 


wM 


those  expectations  with  frequencies  that  sum  to  zero.  Moreover  the  assumption 
that  uq  and  lie  inside  the  isosceles  triangular  subset  of  the  fundamental  domain 
implies  that  uq,  u;2,  and  w3  are  mutually  unequal  so  that  “coincidental”  equalities 
are  absent.  One  finds 

ai  =  (|ii|2|i2|2|i3|2) 

a2  =  -[(|i1|2)(|x2|2|x3|2)  +  (|i2|2)(|iii2|i3|2) 

+  (|x3|2)(|x1|2|x2|2)] 

03  =  -|{ZiX2X3)|2 

a4  =  2{|xi|2){|x2|2){|x3|2).  (4-7.9) 

Examination  of  and  a3  validates  the  earlier  claim  that  the  expectation 
of  F2  is  contained  within  the  cumulant.  In  fact,  the  entire  desired  variance  in 
equation  (4-7.3)  is  just  (dj  +  a3)/N2  so  that  combining  equations  (4-7.6),  (4-7.7), 
and  (4-7.9)  gives 

((F(a>|,u>2))2)  —  /2(nq,  u2)  =  —  [ai  +  a-i]  +  /e(u>i>  “>2,  u>3,  uq,  uq)  +  0(N  2). 

(4-7.10) 

The  asymptotic  behaviors  of  a2  and  o4  are  easily  shown  to  be 

a2  =  2N3/2(uq)/2(uq)/2(u;3)  +  0(N2) 

«4  =  -3N3/2(uq)/2(cd2)/2(u>3)  +  0(N2),  (4-7.11) 

so  that  equation  (4-7.10)  simplifies  to 

((F(uq,  u;2))2)  ~  /3(w ii^2)  =  N/2(uq)/2(u>2)/2(u;3)  +  0(1).  (4-7.12) 

This  result  is  the  key  result  which  underlies  the  Hinich  estimator  and 
is  equivalent  to  equation  2.5  of  Hinich  under  the  simplifying  assumptions  made 
here.  (To  get  the  complete  result,  it  is  only  necessary  to  allow  uq  and  u;2  to  be 
arbitrary  and  then  include  the  “coincidental”  equalities  which  occur  in  evaluating 
the  cumulants  in  (4-7.7).) 

4-8  The  Mathematical  Results  Can  Be  Turned  into  a  Practical  Pro¬ 
cedure  for  Estimating  the  Bispectrum. 

The  previous  section  has  shown  that  F(uq,uq)  is  an  unbiased  but  not 
consistent  estimate  of  the  bispectrum.  It  does  not  consistently  estimate  the  bi¬ 
spectrum  because  its  variance  does  not  approach  zero.  In  fact,  by  equation  (4- 
7.12),  the  variance  actually  increases  with  N.  In  this  section  F(uq,uq)  will  be 
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turned  into  a  practical  (consistent)  estimator.  Unlike  the  earlier  sections,  the 
mathematics  will  be  stated  and  not  proved.  This  is  due  not  to  increased  com¬ 
plexity,  but  to  the  lesser  relevance  of  the  mathematics  for  present  purposes. 

For  simplicity  of  notation,  F(u>j,a>j(.)  will  be  denoted  F(  j,  k). 

To  obtain  a  consistent  estimate,  the  F(j,  k)  is  averaged  over  adjacent 
values  in  a  square  of  M2  points  centered  at  the  points 

.  .  /(2m  —  1)M  —  1  (2n  -  1)M  -  K 

\9mi9n)  ~  (  2  >  2  / 

=  where  m  =  l, . . . ,  (N/(2M)  +  0.5] 

=  andn  =  l,...,  min(m,  [N/M  —  2m  -f  3/2  -f  3/(2M)])j(4-8.1 ) 

where  M  is  determined  by  trading  off  resolution  versus  bias  (as  shall  be  seen 
below)  and  [  ]  denotes  the  “greatest  integer  not  exceeding”  function. 

The  values  of  m  and  n  above  are  constrained  so  that  the  center  points 
are  within  the  principal  domain.  The  constraints  on  m  and  n  are  derived  from 
the  principal  domain  constraints  on  the  set  of  ( j,k): 

0<j<  N/2, 0  <  k  <  j,  2j  +  k  <  N.  (4-8.2) 

If  not  all  points  within  the  square  are  within  the  principal  domain,  only 
those  points  within  the  principal  domain  are  included  in  the  average.  Thus  the 
bispectrum  estimator  is  given  by 

mM-l  nM-l 

B(m,n)  =  M-2  £  £  F  (j,k),  (4-8.3) 

j — ( 771  —  1 ) M  k—(n—  1)M 

subject  to  the  constraints  in  equation  (4-8.2). 

If  the  bispectrum  is  slowly  varying  over  the  square,  then  this  estimator 
is  unbiased: 

(B(m,  n))  =  B(/9m,/9J  +  0(M/N).  (4-8.4) 

If  the  power  spectrum  is  slowly  varying  over  the  band  of  width  M  cen¬ 
tered  at  the  appropriate  frequencies,  then  the  variance  of  this  estimator  is  given 

by 

Var  B(m,  n)  =  NM-4Q(m,  n)/2(/9m)/2(/5J/2(/^  +/,„)  +  0(M/N),  (4-8.5) 

where  Q(m,  n)  =  M2  if  the  square  is  entirely  within  the  principal  domain;  other¬ 
wise  it  is  equal  to  the  number  of  points  [j,  k)  within  the  square  but  not  on  the 
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boundaries  j  =  k  or  2j  +  k  =  N  plus  twice  the  number  of  points  (j,  k)  on  the 
boundaries. 

From  equation  (4-8.5)  it  can  be  seen  that  the  estimator  given  by  equation 
(4-8.3)  is  a  consistent  estimator  for  values  of  M  given  by 

v/N  <  M  <  N.  (4-8.6) 


1 


#80$ 
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The  bias  increases  and  the  variance  decreases  as  M  increases.  If  the 
time  series  is  divided  into  L  segments,  each  of  length  N,  and  the  bispectrum  is 
estimated  for  each  segment  and  all  L  estimates  are  averaged  together,  then  if 
each  of  the  L  bispectrum  estimates  is  uncorrelated,  the  variance  of  this  coherent 
average  is  just 

Var  B^m,  n)  =  Var  B(m,  n)/L.  (4-8.7) 

In  this  case,  consistency  is  obtained  for  values  of  M  given  by 

V/N/L  <  M  <  NL.  (4-8.8) 


The  asymptotic  distribution  of  the  estimator  given  by  equation  (4-8.3)  is 
complex  normal  and  independent  for  each  frequency  pair.  Thus  the  distribution 
of  the  random  variable 


X2(m,n) 


2|B(m,  n)|2 
Var  B(m,n) 


(4-8.9) 


is  noncentral  chi-square  with  two  degrees  of  freedom  and  noncentrality  parameter 


where 


A (m,  n)  = 


NM~4Q(m,  n) 


'  fgn)i 


lif 9m  1  f 9n) 


|B(m,n)|2 


f 2  (  f 9m  fgn  )  /2  (  f 9m  +  fgn) 


(4-8.10) 


(4-8.11) 


is  called  the  skewness  function.  An  estimate  of  the  normalized  bispectrum  given  in 
equation  (4-8.9)  can  be  obtained  by  using  an  estimate  of  the  power  spectrum  in  the 
expression  for  the  variance.  If  the  periodogram  estimator  for  the  power  spectrum 
is  used  and  it  is  smoothed  over  a  band  of  at  least  M  adjacent  values,  then  this 
estimate  of  the  normalized  bispectrum  will  also  be  noncentral  \2  distributed  with 
tw’o  degrees  of  freedom  and  the  same  noncentrality  parameter.  If  the  bispectrum 
is  coherently  averaged  over  L  segments,  then  the  distribution  of  the  normalized 
bispectrum  after  averaging  is  noncentral  chi-square  with  2  degrees  of  freedom 
and  noncentrality  parameter  LA(m,  n).  If,  on  the  other  hand,  the  bispectrum 
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is  normalized  as  in  equation  (4-6.19)  and  the  normalized  bispectrum  is  averaged 
over  L  segments,  then  the  distribution  of  this  incoherent  average  is  noncentral 
chi-square  with  2L  degrees  of  freedom  and  noncentrality  parameter  LA(m,n). 

If  the  original  time  series  is  a  linear  process  then  it  has  constant  skewness 
function.  If  it  is  a  Gaussian  process,  the  constant  value  is  zero.  Thus  for  a 
Gaussian  process  the  asymptotic  distribution  of  the  estimator  of  the  normalized 
bispectrum  is  a  central  chi-square  with  two  degrees  of  freedom.  This  allows  the 
values  of  the  normalized  bispectrum  estimator  to  be  thresholded  and  compared 
to  values  from  a  known  distribution  to  determine  when  statistically  significant 
values  of  the  bispectrum  estimator  are  occurring,  i.e.,  values  large  enough  that 
the  assumptions  of  Gaussianity  and  or  linearity  can  be  rejected  with  some  degree 
of  confidence.  Specific  tests  for  Gaussianity  and  linearity  have  been  developed 
using  this  estimator. 
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4—9  The  Practical  Procedure  Can  Be  Simplified  To  Give  Order-of- 
Magnitude  Estimates. 

In  practice,  the  formulae  given  in  the  previous  section  can  be  simplified 
even  further.  The  estimator  under  consideration  is  obtained  by  starting  with  L 
sequences  of  length  N  each.  For  each  sequence,  one  first  constructs  F(j,  k)  and 
then  averages  over  squares  with  length  M  to  get  B.  The  estimator  is  obtained  by 
taking  the  average  of  B  over  all  L  sequences. 

One  finds  that  the  estimator  bias  is  approximately 


M/(NL) 


and  the  estimator  standard  deviation  is 


(4-9.1) 


(4-9.2) 
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4-10  Estimating  the  Bispectrum  Requires  Smoothing. 

In  a  nutshell,  this  chapter  has  shown  that  just  as  periodogram  estimates 
need  to  be  averaged  to  get  a  consistent  power  spectrum  estimate,  so  must  the 
raw  bispectral  estimates  be  similarly  averaged.  Moreover,  a  particular  method 
for  doing  this  averaging,  the  Hinich  procedure,  has  been  studied.  The  Hinich 
procedure  averages  adjacent  raw  bispectral  estimates.  It  is  also  possible  to  average 
in  the  time  domain  to  get,  a  smoothed  bicorrelation  function  and  then  transform. 
The  results  presented  have  been  based  on  a  formula  by  Brillinger  which  states 
that  the  transform  of  a  cumulant.  is  the  cumiilant  of  the  transforms. 
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This  chapter  begins  the  treatment  of  practical  issues  which  are  needed  to 
successfully  apply  the  bispectrum.  The  following  chapter  extends  this  treatment 
to  a  worked  example. 
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CHAPTER  5 


The  Model  in  Practice 


5-1  A  Simple  Example  is  Given  by  Choosing  the  Bispectrum  Con¬ 
stant  in  a  Specified  Rectangular  Region  and  Zero  Elsewhere. 

A  simple  example  of  a  non-trivial  bispectrum  is  one  in  which  the  bi¬ 
spectrum  is  zero  except  for  a  rectangular  region  where  it  is  constant.  The  rect¬ 
angular  region  is  centered  at  A  A ,A2  and  its  sides  are  parallel  to  the  frequency  axes 
with  length  2AA[  in  the  uq  direction  and  length  2AA2  in  the  uq  direction.  The 
magnitude  of  the  bispectrum  shall  be  denoted  b  and  the  bispectrum  will  be  taken 
to  have  zero  imaginary  part.  (See  Figure  5.1.) 

The  complete  bispectrum  for  this  example,  defined  for  the  entire  uq,uq 
plane,  has  numerous  symmetries  that  have  been  presented  in  Chapter  2.  These 
symmetries  can  be  handled  by  the  methods  of  Chapter  3.  Here  it  suffices  to 
compute  the  double  Fourier  transform  of  the  function  which  looks  exactly  like 
that  shown  in  Figure  5.1  and  possesses  none  of  the  additional  symmetries  except 
for  the  27t  periodicity  in  uq  and  uq  and  the  conjugation  symmetry. 

This  double  Fourier  transform  is  defined  by 

A(r,  5)  =  — —J  J  A(uq,uq)exp[f(ruq  +  suq)]duqduq,  (5-1.1) 

where  the  integration  limits  are  — 7T  to  7 r  and 

1 ,  uq)  =  6{5’(uq;A[,AA  1  )S(u>2]  A2,  A  A2)  -f  5(uq;  A  j ,  —  A  A  j  )S  (uq;  —  A2,AA2)} 

(5— 1 .2) 

represents  the  rectangle  function.  (The  second  addend  provides  conjugation  sym¬ 
metry  and  is  included  to  ensure  reality  of  the  double  transform.)  The  function  S 
is  defined  by 

for  a  -b<  x  <a  +  b 


S(  j;  a;  b )  = 


otherwise 


(5-1.3) 
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The  Fourier  transform  in  equation  (5-1.1)  is  easy  to  evaluate.  One  has 

1  t\ i+AAi  r\  2+AA2 

h(r,s)  =  — -{  /  expfiruqlduq  /  expftsu^ldu^}  +  c.c.  ,  (5-1.4) 

47T  J Al -*AAi  ‘/A2“"4\A2 

where  c.c.  denotes  the  complex  conjugate.  Or,  explicity, 


r  aa,aa2 

O 

II 

<0 

II 

,,  v  26 

h(r,s)= 

AA1cos(sA2)sin(sAA2)/.s 

AA2cos(rAi)sin(rAAj)/r 

r  =  s,  s  ^  0 
r  /  0,  s  =  0 

,  cos(rAi  -f  sA2)sin(rAAi)sin(sAA2)/(rs) 

r,s  ^  0 

(5-i .5) 

If  the  rectangular  region  is  restricted  to  lie  entirely  within  the  isosceles 
triangular  subset  of  the  fundamental  domain,  then  the  methods  of  Chapter  3  can 
be  used  to  get  the  time  domain  correlation  function  c3, 

c3(r,s)  =  h(r,s)  +  h(s,r)  +  h(r  —  s,  —s)  +  h(s  —  r,—r)  +  h(—s,r  —  s)  +  h(—r,s  —  r). 

(5-1.6) 

From  this  c3  one  can  determine  the  corresponding  time  domain  coeffi¬ 
cients  g  using  equation  (3-3.1)  in  reverse.  Notice  that  g(r,s)  will  be  non-zero 
for  arbitrarily  large  r  and  s.  In  the  computer  implementation,  g(r,  s)  will  be 
truncated  to  zero  past  certain  bounds  R  and  S.  This  truncation  will  introduce  a 
“ringing”  effect,  which  is  familiar  from  Fourier  theory  and  which  can  be  alleviated 
by  using  windowing  techniques.  However,  that  level  of  sophistication  will  not  be 
needed  here. 

Thus  one  may  produce  the  rectangularly  shaped  bispectrum  with  the 
discrete  time  series  model  below: 

OO 

x{t)  =  e(t)  +  T](t)  +  ^  g(m ,  n)e(t  —  m)g(t  —  m  —  n),  (5-1.7) 

m=0 
n= 0 

with  g's  computed  using  equation  (5-1.6),  h's  given  by  equation  (5-1.5).  and 
g(t)  and  e(t)  independent  zero- mean  Gaussian  random  variables  with  standard 
deviation  a.  Next,  it  is  necessary  to  test  the  model. 

5-2  Computer  Simulations  Give  Good  Agreement  with  the  Model. 

Testing  the  preceding  model  involves  assembling  a  good  deal  of  software. 
Moreover,  any  difficulties  that  may  arise  may  be  attributable  either  to  inadequa¬ 
cies  of  the  software  or  to  mistakes  in  the  mathematical  development.  To  ensure 


Figure  5-2  The  “Predicted”  Bicorrelations.  (Test  Model) 

that  such  was  not  the  case,  a  very  simple  model  was  picked  to  start  with.  This 
model  is  defined  by  letting  a  =  0.8  and  taking  only  a  small  number  of  g's  to  be 
nonzero: 

.4  .7  -.2 

g  =  .6  -.5  .3  .  (5-2.1) 

-.3  .1  .4 

These  numbers  were  chosen  more  or  less  at  random.  Then  2000  time 
series  of  128  points  were  generated  from  equation  (3-1.2)  using  these  parameter 
values.  The  g  values  are  related  very  directly  to  the  bicorrelations  as  shown  in 
equation  (3-3.4)  so  that  the  model’s  predicted  bicorrelations  are  essentially  an 
echo  of  the  g's.  This  is  shown  in  Figure  5-2. 

Given  the  model  coefficients  one  can  compute  the  predicted  cor¬ 
relations  using  equation  (3-2.2),  the  predicted  power  spectrum  using  equation 
(3-4.3),  and  the  predicted  bispectrum  using  equation  (3-5.5).  One  may  generate 
the  “observed”  quantities  as  follows.  From  the  generated  time  series,  compute 
the  correlations  and  the  bicorrelations  directly  from  their  definitions  and  then 
Fourier  transform  these  quantities  to  get  the  corresponding  power  spectrum  and 
bispectrum.  (This  method  is  not  computationally  efficient,  but.  it  has  the  advan- 


Figure  5-3  The  “Observed”  Bicorrelations.  (Test  Model) 

tages  of  simplicity  and  of  reliability  when  developing  software  from  scratch.)  If 
everything  is  as  it  should  be,  the  observed  and  the  predicted  quantities  should  be 
in  close  agreement. 

These  computations  were  done  on  a  Micro  VAX  computer  at  Applied  Re¬ 
search  Laboratories  (ARL:UT).  The  software  which  generates  the  time  series  and 
does  the  comparison  of  prediction  versus  observation  is  written  in  VAX  Fortran 
and  has  been  termed  Bispectrum  Workshop. 

The  results  of  Bispectrum  Workshop  are  shown  in  the  following  set  of 
figures.  First,  Figure  5-3  shows  the  observed  bicorrelations.  Next,  side-by-side 
comparisons  of  the  correlations,  power  spectra,  and  bispectra  are  shown  in  Figures 
5-4,  5-5,  and  5-6,  respectively.  The  visual  agreement  is  very  good  is  all  cases, 
and  an  examination  of  the  actual  numbers  (not  presented  here,  but  available  from 
Workshop  output)  confirms  the  match.  Note  also  that  the  ‘la2  term  in  equation 
(3-4.3)  equals  1.28  which  lies  below,  but  near,  the  power  spectrum  curve  as  one 
might  expect. 

With  this  test  successfully  accomplished,  it  becomes  desirable  to  try  to 
reproduce  a  given  bispectrum.  A  second  program.  ModelMaker,  was  written  for 
this  purpose.  Rather  than  follow  the  analytical  treatment  in  Section  5-1,  this 
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Figure  5-7  Predicted  and  Observed  Bispectra.  (Square  Model) 

program  takes  a  supplied  continuous  frequency  bispectrum  as  input,  samples  it, 
double  Fourier  transforms  to  get  the  bicorrelation,  and  then  computes  the  y  s  from 
the  bicorrelations.  These  g's  are  output  to  a  file  which  Workshop  can  read.  (Note 
that  the  g's  are  only  determined  if  cr  is  assumed  known.  ModelMaker  arbitrarily 
sets  cr  =  l.  The  following  section  will  consider  this  matter  is  more  detail.) 

Here  a  model  of  the  form  shown  in  Figure  5-1  was  used,  with  A,  =  3/16, 
A2  =  9/32,  AA,  =  5/128,  and  AA2  =  15/128.  In  this  case  5000  times  series  of  128 
points  were  produced.  However,  the  sampling  was  intentionally  done  very  coaisely 
just  to  show  sampling  effects.  Samples  were  taken  at  frequencies  /,  =  ?7r/32. 

Side-by-side  comparisons  are  shown  in  the  following  figures.  First,  notice 
that  the  predicted  bispectrum  is  not  flat;  thus  one  cannot  fault  the  observed 
bispectrum,  which  is  in  good  agreement  with  it.  (Figure  5-7.) 

The  non-flatness  of  the  bispectrum  is  shared  by  the  power  spectra.  Here, 
however,  the  limiting  2/r2  value  is  2  and  this  value  is  actually  approached  quite 
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Figure  5-8  Predicted  and  Observed  Power  Spectra.  (Square  Model) 
closely.  (Figure  5-8.) 

Finally,  the  predicted  bicorrelations  look  much  as  one  would  expect  for 
this  model,  and  the  observed  bicorrelations  can  be  seen  to  agree  well.  (Figure 
5-9.) 

Thus,  computer  simulation  verifies  the  mathematics  presented  here  and 
shows  that  this  model  can  be  used  to  invert  the  bispectrum.  This  section  con¬ 
cludes  with  a  few  remarks  concerning  parameter  values. 

5-3  Reasonable  Parameter  Values  are  Obtainable  through  a  Com¬ 
bination  of  Mathematical,  Statistical,  and  Practical  Considera¬ 
tions. 


It  is  assumed  that  one  wants  to  reproduce  a  given  bispectrum  with  a 
model  of  the  form  given  in  equation  (3-1.2).  It  is  further  assumed  that  one  will 
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estimate  the  model  bispectrum  using  the  Hinich  procedure  described  in  Section 
4.  (The  Hinich  procedure  is  not  used  in  the  current  implementation  of  Workshop 
as  described  in  the  preceding  section.)  The  given  bispectrum  determines  the  g’s 
up  to  a  scale  factor.  (Were  cr  known,  this  constraint  would  determine  the  scale 
factor  as  well.) 

The  model  and  estimator  parameters  at  one’s  disposal  are 


<7,  g scale,  R,  S,  M,  N,  and  L. 


(5-3.1) 


First,  notice  that  the  given  bispectrum  can  be  double  Fourier  trans¬ 
formed  to  get  bicorrelations.  These  bicorrelations  should  be  examined  to  see  at 
which  lag  they  begin  to  decay.  The  values  of  R  and  of  S  should  be  chosen  to 
exceed  this  decay  lag.  Then  the  value  of  N  (the  number  of  points  in  a  time  series) 
should  be  chosen  large  with  respect  to  R  -j-  S. 

It  is  clear  that  in  the  present  formulation,  <r  and  an  overall  scaling  factor 
dscaie  can  be  traded  off  against  one  another  to  give  a  given  bispectrum.  This 
ambiguity  can  be  employed  to  one’s  advantage  by  choosing  the  a  which  yields 
the  smallest  estimator  variance  for  a  given  bispectrum. 

A  time  consuming,  but  exact,  method  for  doing  this  would  start  by 
computing  the  variance  of  (4-8.5),  using  the  expression  for  the  power  spectrum 
given  in  Chapter  3.  This  gives  the  variance  at  each  point  in  the  domain.  One 
can  now  define  some  overall  variance;  e.g.,  average  variance  or  perhaps  variance 
weighted  by  the  importance  of  regions  of  interest,  to  give  a  single  number.  Then 
one  can  vary  a  and  determine  which  value  minimizes  this  quantity. 

A  simpler,  but  approximate,  method  would  begin  by  approximating  the 
magnitude  of  the  given  bispectrum  by  some  number  B.  Next,  write  the  g's  as 


fif(m,n)  =  g scan  g'(m,n), 


(5-3.2) 


where  the  g'  are  the  g  values  which  correspond  to  cr  =  1  (and  hence  are  fixed  for 
a  given  bispectrum).  Now  compute  the  quantity 


£K(A)f 


(5-3.3) 


and  average  it  over  A  to  get  the  value  G.  With  these  approximations  one  can 
write  the  expression  for  the  power  spectrum  as 


h  =  2<r2  +  cr  4g2KaUG. 


(5-3,1) 
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Now  <7  and  gscau  can  be  determined  by  minimizing  /2  subject  to  the 
constraint  that  the  bispectrum  remain  constant  (or  that  cr4g3caie  =  B).  When 
one  does  this,  one  finds  that 

<7  =  [B2G\^.  (5-3.5) 

(And,  for  completeness,  one  gets  g3caie  =  B‘^G2^.) 

The  values  of  M  and  of  L  can  now  be  put  into  the  simplified  estimator 
equations  (4-9.1)  and  (4-9.2)  to  get  adequately  small  bias  and  standard  deviation. 

5—4  The  Model  Works! 

This  chapter  demonstrates  that  the  universal  bispectrum  model  can  be 
used  to  reproduce  a  desired  bispectrum.  The  equations  presented  in  the  earlier 
chapters  have  been  shown  to  be  consistent  with  computer  simulations.  Some 
discussion  of  parameter  values  has  been  presented.  The  discussion  as  presented 
here  is  still  incomplete,  however. 

There  are  alternate  routes  to  actually  employing  the  model  which  have 
not  been  discussed.  The  approach  taken  here  is  largely  a  time  domain  approach. 
A  corresponding  frequency  domain  approach  may  have  some  advantages.  This 
frequency  domain  approach  would  derive  the  g  from  the  given  bispectrum  as 
before,  but  then  instead  of  going  directly  to  the  g's,  the  series  would  be  computed 
in  the  frequency  domain  using  (3-7.14).  If  desired,  this  series  could  then  be 
transformed  to  get  the  time  series.  This  approach  is  likely  to  be  faster:  it  involves 
only  a  one-dimensional  integration  rather  than  a  two-dimensional  summation,  and 
it  is  likely  to  be  more  accurate:  agreement  between  prediction  and  observation 
does  not  depend  on  Fourier-transforming  random  quantities. 

Further,  the  model  can  be  naturally  extended  to  give  some  degree  of 
independent  control  of  the  power  spectrum  while  retaining  the  capability  of  re¬ 
producing  an  arbitrary  bispectrum.  One  obvious  way  of  doing  this  is  to  introduce 
a  third  independent  random  Gaussian  time  series  (( n )  and  add  a  moving  average 
contribution  from  this  variable  to  the  model  (3-1.2).  The  new  terms  will  not  affect 
the  bispectrum  but  will  contribute  additively  to  the  power  spectrum.  Moreover 
this  additional  contribution  to  the  power  spectrum  can  be  chosen  at  uill.  (One  is 
stuck  with  at  least  as  much  power  at  each  frequency  as  given  by  the  un-adorned 
model,  but  one  can  add  power  as  desired  to  re-shape  the  power  spectrum  without 
changing  the  bispectrum.)  An  alternative  means  of  gaining  this  control  is  to  add 
r/(m)e(n)  terms  for  m  >  n.  These  terms  were  neglected  in  the  current  formulation 
because  they  spoil  the  identifiability  of  the  model  and  they  allow  no  additional 
control  of  the  bispectrum:  g(m)e(n)  and  contribute  additively  to  the 
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bispectrum  so  that  one  term  can  do  duty  for  both.  But  these  two  symmetric 
terms  do  not  contribute  additively  to  the  power  spectrum.  Therefore  varying  the 
summands  while  keeping  the  sum  constant  will  change  the  power  spectrum  but 
not  the  bispectrum. 

There  are  other  interesting  questions  that  should  be  explored.  For  ex¬ 
ample,  is  there  a  minimum  power  spectrum  model  for  a  given  bispectrum,  and, 
if  so,  is  this  model  of  the  form  of  the  universal  bispectrum  model?  Can  this 
approach  be  extended  to  higher  order  spectra?  What  does  the  continuous  time 
formulation  of  this  model  look  like?  (In  fact,  the  continuous  time  version  will 
be  a  stochastic  integral  rather  than  a  stochastic  summation.)  The  author  will 
concede  that  the  the  discussion  of  practical  considerations  given  here  needs  to  be 
amplified  considerably. 


CHAPTER  6 


I 

Summary  I 


This  report  has  attempted  to  provide  an  intuitive,  uniform  treatment 
of  the  bispectrum.  Numerous  perspectives  on  the  bispectrum  were  presented 
with  emphasis  on  physical  and  mathematical  intuition.  A  limited  catalog  of 
examples  is  presented  in  an  appendix  to  the  report.  A  universal  model,  capable 
of  generating  any  possible  bispectrum,  was  worked  out  in  both  the  time  and 
frequency  domains.  An  example  of  its  use  was  presented.  Attention  was  given 
to  the  symmetries,  relationship  between  continuous  and  discrete  time  bispectra, 
and  the  mathematics  underlying  the  sampling  properties. 


APPENDIX  A 


Catalog  of  Examples 


A-l  Linear  Non-Gaussian  White  Noise  Passed  through  Linear  Device 


This  model  and  the  next  two  are  taken  from  the  paper  by  Huber,  Kleiner, 
Gasser,  and  Dumermuth  [4-1].  Assume  (A,)  =  0,  (Xf)  =  1,  and  (Xf)  =  (3. 


Define 

Yi  =  YJh(i-j)Xj. 

(A-l.l) 

Then,  if 

h(t)  —  [  exp(2ni\t)H(\)d\, 

Jo 

(A-1.2) 

f2(u)  =  \H(u)\2 

(A  1.3) 

6(u^  1,1^2)  —  ).//(u?2)-^*(^i  *4"  ^2)* 

(A-l. 4) 

A-2  Gaussian  Noise  Passed  through  Squarer. 

Assume  A,  is  Gaussian  with  (A,)  =  0  and  (Xf)  =  1  and  spectral  density 
g( A).  Pass  this  noise  through  a  squarer: 

Zi  =  Xi  +  aXf ,  (A-2.1) 

where  Z;  is  the  output  of  the  squarer,  a  is  the  amplitude  of  the  squared  part  and 
is  numerically  small. 

The  spectrum  of  Z,  is 

f2(\)  =  g(\)  +  2a2  J  g(n)g{\  -  n)dp  (A-2. 2) 

and  the  bispectrum  of  Z*  is 

f 3 ( ^ i > ^ 2 )  =  2a{5(A1)5r(A2)+5f(A1)(?(A1+A2)+sf(A2)3(A1  +  A2)}-|-0'3  terms  and  higher 

(A -2.3) 


A-3  Poisson  Activity 
Define 

Yi  =  J2l>(t~Tk.), 

k 

where  the  1\  are  Poisson-distributed  time's  with 


(A -3.1) 


This  model  [1.5-2J  has  nonlinearity  of  the  simple  form 

Y(u)  =  AYMYfa)  +  Y'.  (A— 5.3) 

Here  Y'  contributes  to  Y{uj)  but  is  statistically  independent  of  the  cou¬ 
pling  term.  A  represents  the  coupling  constant. 

Computing  the  power  spectrum  at  u,  one  gets 

PM  =  |/t|2(|V'M)l'M)p)  +  (|C'I2).  (A-  5.4) 


The  bispectrum  at  uq,  iu2  is 

B(aq,u>2)  —  A*  {\Y  (lo^Y  (ll>2)\2)  ■ 

A-6  A  Realistic  Evolution  Equation,  due  to  Powers. 

Here  a  realistic  evolution  equation  is  assumed: 

=  V<f>Wl<i>^exp(i6kx) 
ox 


(A-5.5) 


(A-G.l) 


with  frequency  matching  (u>  =  uq+uq),  possible  mismatch  in  wavenumber  (charac¬ 
terized  bv  8k  =  A  k^  -ku),  and  wave-wave  coupling  (with  coupling  coefficient 
v)  [1-5-1]. 

ft  will  be  assumed  that  the  above  equation  is  true  for  some  fixed  u>  and 
that  for  the  wave  at  this  frequency  uj  there  is  one  pair  of  frequencies  (uq  and  uq) 
i  hat  dominates  the  integral.  In  particular  the  frequency  dependence  of  V  can  be 
omitted. 

In  terms  of  the  quantities  of  experimental  interest,  the  above  equation 

becomes 

=  VY^  YU2+ik(u)Y(uj).  (A-6. 2) 


It  is  a  simple  exercise  (left  to  the  reader)  to  take  the  above  equation  and 


derive 


0P(u/) 


=  V  B(  •jj  1 .  u.’2 )  T  V  B  ( u>  1 .  to2 )  • 


In  terms  of  strictly  real  quantities  l  /j,  1 7,  B/f,  and  B/,  where 
2  V  =5  \  n  +  i  \  i-  atifll)  =  B/f  +  ;B/. 

This  gives  the'  wonderful  equation 


( A  - 1> .  3 ) 


(A  6,1) 


—  1  /fB/d^’i- “A*)  +  I  /B/(w  |,  m2). 


A  6.0) 


A-7  A  Stochastic  Model  of  Gear  Noises 


This  model  is  due  to  Sato  and  co-authors  [Appendix-1]  and  applied  to 
diagnosing  gear  noises  for  signs  of  wear.  The  model  as  given  is  a  continuous  time 
model.  One  has 

00 

x(t)  =  ^2  ancos(2imf0t  +  <j>n  +  dn)  (A-7.1) 

n=0 

where  u„  are  amplitudes  of  the  various  harmonics,  (f>n  are  deterministic  phases, 
and  the  <ln  are  random  phases. 

The  power  spectrum  for  this  model  is 

P(/)=  £  (A -7.2) 

n=— oo.n/O 

where  6  denotes  the  Dirac  delta  function. 

The  bispectrum  is 

B(/l,  f2)  =  Em.n=-oc..m.n7fO  ]/8  ( ,.<’Xp[i(^m  +  (p„  +  <P,n+n)\ 

(exp  [/'(</  in  “I"  dn  ^m+n  )])<5(/i  -  mf0)S{f2  -  »/o) 

(A  7.3 

so  that  one  sees  that  the  magnitude  of  the  various  terms  is  controlled  by  the 
expectation  of  the  random  phases.  In  particular,  if  the  random  phases  are  absent, 
then  the  expectation  has  its  maximum  value  of  one  and  the  bispectrum  shows 
the  peaks  most  clearly. 

A-8  Brillinger’s  Polyspectrum  Example. 

This  is  a  continuous  time  example  [Appendix-2]  which  comes  near  to 
being  a  generalization  of  the  universal  bispectrum  model.  One  takes 

X(t}  =  J  a(t  —  ii)d\V(u)  +  J  J  b(t  —  u,  t  —  v)d\V(u)dW(r)  (A-8.1) 

where  \V(t)  is  a  Wiener  process,  a  has  Fourier  transform  A.  b  is  assumed  sym¬ 
metric  in  its  arguments  and  has  double  Fourier  transform  77. 

The  bispectrum  of  X(t)  is 

2[  A(u.’|  )A(^2)B(  —  U/’i,  — u.^)  +  A(u;-2)./1(u;;j)/7(  — u.'|,  — u.'a) 

+  A(  uJ-|  )  A  (  )  /7( — — up)] 

-j-  ■s  j  Any  Pi  v  m  [  /  7  ( .  u.’  j  —  u.’ )  l  $  ( -j-  ox  —  u.’ )  /7(  u.'  —  us  j ,  —-  *jj  —  )  ]  du.1 , 

(A  S.2) 
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